File and Folder Models for a Simple File Storage Service Using VueJS, Flask, and RethinkDB

Aug 8, 2018 • 13 Minute Read

File and Folder Models

We'll be creating simple models for working with the files and folders similar to what we did with the User model. We'll create a Folder model as a child of the File model.

For more information about how to set up your workspace, check out the first guide in this series: Introduction and Setup to Building a Simple File Storage Service Using VueJS, Flask and RethinkDB.

File Model

As we proceed, you will notice that files are stored flat on the filesystem. Each user has a single folder that holds all of their files, while the folder structure is purely logical and lives in the database. This way, we keep writes to the filesystem to a minimum. To do this we will employ some neat techniques that will probably be useful to you in future projects.
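To make the split between physical and logical layout concrete, here is a minimal sketch. The names UPLOAD_ROOT, disk_path, and logical_path are hypothetical and not part of this guide's code, and a plain dictionary stands in for the database table.

```python
import posixpath

UPLOAD_ROOT = "/uploads"  # hypothetical storage root, not from the guide


def disk_path(user_id, file_id):
    """Physical location: every file sits directly in its owner's folder."""
    return posixpath.join(UPLOAD_ROOT, str(user_id), str(file_id))


def logical_path(docs, file_id):
    """Logical location: follow parent_id links until we leave the table."""
    parts = []
    current = docs.get(file_id)
    while current is not None:
        parts.append(current["name"])
        current = docs.get(current["parent_id"])
    return "/" + "/".join(reversed(parts))


# Stand-in documents: a folder in the root ('0') and a file inside it
docs = {
    "f1": {"name": "photos", "parent_id": "0"},
    "f2": {"name": "cat.png", "parent_id": "f1"},
}
print(disk_path("u1", "f2"))     # /uploads/u1/f2
print(logical_path(docs, "f2"))  # /photos/cat.png
```

Moving a file between folders in this scheme only changes documents in the database. The path on disk stays the same.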

Create the base models in /api/models.py

    class File(RethinkDBModel):
        _table = 'files'


    class Folder(File):
        pass
    

We start out by creating the create() method for the File model. This method will be called when we make a POST request to the /users/<user_id>/files/<file_id> endpoint used to create a file.

    @classmethod
    def create(cls, **kwargs):
        name = kwargs.get('name')
        size = kwargs.get('size')
        uri = kwargs.get('uri')
        parent = kwargs.get('parent')
        creator = kwargs.get('creator')

        # Direct parent ID ('0' stands for the root folder)
        parent_id = '0' if parent is None else parent['id']

        doc = {
            'name': name,
            'size': size,
            'uri': uri,
            'parent_id': parent_id,
            'creator': creator,
            'is_folder': False,
            'status': True,
            'date_created': datetime.now(r.make_timezone('+01:00')),
            'date_modified': datetime.now(r.make_timezone('+01:00'))
        }

        # Insert the document and copy the generated key back into it
        res = r.table(cls._table).insert(doc).run(conn)
        doc['id'] = res['generated_keys'][0]

        # Register the new file in its parent folder's objects list
        if parent is not None:
            Folder.add_object(parent, doc['id'])

        return doc
    

First, we collect all the information we need from the keyword arguments dictionary: name, size, file URI, creator, and so on. We also accept a parameter called parent. This is the document (dictionary) of the folder in which we want to store the file, and its id becomes the file's parent_id. We can choose not to pass this parameter if we want to store the file in the root folder. As we go on, you will see how we can use this to create complex nested folder structures.

Notice here how having a parent of None makes the parent_id field '0'. This takes care of cases where a file is created with no parent. We assume that such a file is stored in the root folder, which has an ID of '0'.

We collect all of this information about the file into a dictionary and call the insert() function to store it in the database. The returned dictionary from calling the insert function contains the IDs of the newly generated documents. After inserting the dictionary, we populate the ID information in the dictionary so that we can return it to the users.
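As a rough illustration of that step, the sketch below uses a hard-coded dictionary shaped like the summary RethinkDB returns for a single-document insert. The UUID and the file name are made up for the example, and no database is contacted.

```python
# Illustrative stand-in for the value returned by
# r.table('files').insert(doc).run(conn); the key below is made up.
sample_result = {
    "deleted": 0,
    "errors": 0,
    "generated_keys": ["dd782b64-70a7-43e4-b65e-dd14ae61d947"],
    "inserted": 1,
    "replaced": 0,
    "skipped": 0,
    "unchanged": 0,
}

doc = {"name": "notes.txt", "parent_id": "0"}

# One document was inserted, so exactly one key was generated
doc["id"] = sample_result["generated_keys"][0]
print(doc["id"])  # dd782b64-70a7-43e4-b65e-dd14ae61d947
```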

In the last three lines of this method, we check whether the parent is not None. Since this file manager implementation has folders, whenever we create a file inside a folder we have to logically add the new object to that folder. We do this by adding the object ID to an objects list in the record of the folder we're storing it in. This is done by calling add_object, a method we will create in the Folder model.

Next up, we will go back to our base RethinkDBModel class to create a number of useful methods which we may or may not override in the child classes.

    class RethinkDBModel(object):
        @classmethod
        def find(cls, id):
            return r.table(cls._table).get(id).run(conn)

        @classmethod
        def filter(cls, predicate):
            return list(r.table(cls._table).filter(predicate).run(conn))

        @classmethod
        def update(cls, id, fields):
            status = r.table(cls._table).get(id).update(fields).run(conn)
            if status['errors']:
                raise DatabaseProcessError("Could not complete the update action")
            return True

        @classmethod
        def delete(cls, id):
            status = r.table(cls._table).get(id).delete().run(conn)
            if status['errors']:
                raise DatabaseProcessError("Could not complete the delete action")
            return True
    

Here we created wrapper methods for the RethinkDB get(), filter(), update(), and delete() functions. This way, subclasses can build on those functions for more complex interactions.
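These wrappers work for every subclass because a classmethod receives the class it was called on, so cls._table resolves to whatever the subclass defines or inherits. The sketch below shows only that mechanism. BaseModel and table_name are hypothetical stand-ins, and no database is involved.

```python
class BaseModel(object):
    _table = None

    @classmethod
    def table_name(cls):
        # cls is the class the method was called on, so each subclass
        # reports its own (or its inherited) _table value
        return cls._table


class File(BaseModel):
    _table = 'files'


class Folder(File):
    pass  # inherits _table = 'files' from File


print(File.table_name())    # files
print(Folder.table_name())  # files
```

This is why Folder documents end up in the same files table as File documents, distinguished only by their is_folder field.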

The next method we will be creating in our File model is a function that will be used to move files between folders.

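    @classmethod
    def move(cls, obj, to):
        # Detach the file from its current folder, then attach it to the target
        previous_folder_id = obj['parent_id']
        previous_folder = Folder.find(previous_folder_id)
        # find() returns None if no folder document exists (e.g. the root folder)
        if previous_folder is not None:
            Folder.remove_object(previous_folder, obj['id'])
        Folder.add_object(to, obj['id'])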
    

The logic here is fairly simple. We call this method when we want to move a file obj into folder to.

We start by getting the ID of the file's current parent directory, which is stored in the parent_id field of obj. We call the Folder model's find function to obtain the folder object as a dictionary called previous_folder. After getting this object, we do two things: we remove the file object from the previous folder, previous_folder, and add it to the new folder, to. We achieve this by calling the remove_object() and add_object() methods of the Folder class. These methods remove the file ID from, and add the file ID to, the objects list in the Folder document, respectively. I will show what the implementations of these look like in a bit.

We're now done with modeling the files. We can carry out basic interactions on files like creating, editing, deleting from the database, and more.

Next we move on to the logic for the Folder model which is very similar to what we did for the files.

Folder Model

    @classmethod
    def create(cls, **kwargs):
        name = kwargs.get('name')
        parent = kwargs.get('parent')
        creator = kwargs.get('creator')

        # Direct parent ID ('0' stands for the root folder)
        parent_id = '0' if parent is None else parent['id']

        doc = {
            'name': name,
            'parent_id': parent_id,
            'creator': creator,
            'is_folder': True,
            'last_index': 0,
            'status': True,
            'objects': None,
            'date_created': datetime.now(r.make_timezone('+01:00')),
            'date_modified': datetime.now(r.make_timezone('+01:00'))
        }

        res = r.table(cls._table).insert(doc).run(conn)
        doc['id'] = res['generated_keys'][0]

        # Register the subfolder in its parent and bump the parent's last_index
        if parent is not None:
            cls.add_object(parent, doc['id'], True)

        # The in-memory parent dict still holds the pre-increment last_index
        cls.tag_folder(parent, doc['id'])

        return doc

    @classmethod
    def tag_folder(cls, parent, id):
        # Root-level folders are tagged with their own ID
        tag = id if parent is None else '{}#{}'.format(parent['tag'], parent['last_index'])
        cls.update(id, {'tag': tag})
    

The create() method here is very similar to what we have for files, with a few modifications. The first thing to notice, obviously, is the fact that we only need the name and the creator of the folder to create a folder. We have used similar logic for determining the parent folder and also used the add_object() method to add the folder to the parent folder in the end.

I have included the is_folder field here which defaults to True for folders and False for files.

You will notice that we call a tag_folder() method here. We will need it later on when we deal with moving folders. To summarise, folders are tagged according to their position in the file tree, and the number of sections in a tag reflects the folder's level in the tree. Any folder stored at the root level has a tag of <id>, where id is the folder's ID. Any folder stored directly below that folder has a tag of <id>#n, where n is an integer that increments with each subfolder added. Deeper folders follow the same pattern and have tags of <id>#n#m and so on. We store the data required for n in the last_index field of each folder, which defaults to 0. As we add subfolders to a folder, we increment its last_index. The tag_folder() method builds the tag, and add_object() takes care of the increment.
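A small worked example may help here. The sketch below mirrors the '{}#{}' format used in tag_folder() above, but works on plain dictionaries instead of the database. The function name make_tag and the short IDs are hypothetical.

```python
def make_tag(parent, folder_id):
    """Build a tag the same way tag_folder() does, without touching a database."""
    if parent is None:
        return folder_id
    return '{}#{}'.format(parent['tag'], parent['last_index'])


# A root-level folder is tagged with its own ID
root = {'id': 'a1', 'tag': make_tag(None, 'a1'), 'last_index': 0}

# Each new subfolder takes the parent's current last_index,
# after which the parent's counter is incremented
child1 = {'id': 'b2', 'tag': make_tag(root, 'b2'), 'last_index': 0}
root['last_index'] += 1
child2 = {'id': 'c3', 'tag': make_tag(root, 'c3'), 'last_index': 0}
root['last_index'] += 1

# A folder nested one level deeper gains one more section
grandchild = {'id': 'd4', 'tag': make_tag(child2, 'd4'), 'last_index': 0}

print(root['tag'])        # a1
print(child1['tag'])      # a1#0
print(child2['tag'])      # a1#1
print(grandchild['tag'])  # a1#1#0
```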

We insert the dictionary we created to house all this data into the database by calling the insert() function.

Next, we will be overriding the find method of the File class to include functionality to show listing information for folders. This will be very useful for our front-end later on.

    @classmethod
    def find(cls, id, listing=False):
        file_ref = r.table(cls._table).get(id).run(conn)
        if file_ref is not None:
            if file_ref['is_folder'] and listing and file_ref['objects'] is not None:
                file_ref['objects'] = list(r.table(cls._table).get_all(r.args(file_ref['objects'])).run(conn))
        return file_ref
    

We start by getting the object using its ID. We show a folder's listing when three conditions are met:

  • listing is set to True. We use this variable to know if we actually want information about the files contained in a folder.
  • The file_ref object is actually a folder. We determine this by checking the is_folder field of the document.
  • We have objects within the objects list of the folder document.

If all these conditions are fulfilled, we get all the nested objects by calling the get_all method on the file table. This method accepts multiple keys and returns all the objects with the respective keys. We use the r.args method to convert the list of objects to multiple arguments for the get_all method. We replace the objects field of the document with the list returned. This list contains the details of each of these nested files/folders.
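The same lookup can be sketched without a database. In the example below a plain dictionary keyed by ID stands in for the files table, and a list comprehension stands in for get_all(). The table contents are made up, and the real get_all() query may not preserve the order of the keys.

```python
def find(table, id, listing=False):
    """In-memory stand-in for the find() override; `table` maps IDs to documents."""
    ref = table.get(id)
    if ref is None:
        return None
    ref = dict(ref)  # copy so the stored document is left untouched
    if ref['is_folder'] and listing and ref['objects'] is not None:
        # Replace the list of IDs with the documents they point to
        ref['objects'] = [table[key] for key in ref['objects'] if key in table]
    return ref


table = {
    'd1': {'id': 'd1', 'name': 'docs', 'is_folder': True, 'objects': ['f1', 'f2']},
    'f1': {'id': 'f1', 'name': 'a.txt', 'is_folder': False},
    'f2': {'id': 'f2', 'name': 'b.txt', 'is_folder': False},
}

names = [o['name'] for o in find(table, 'd1', listing=True)['objects']]
print(names)                         # ['a.txt', 'b.txt']
print(find(table, 'd1')['objects'])  # ['f1', 'f2']
```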

Next, we move on to create the move method for the folder. This is going to be very similar to the move method we created for files earlier on with the inclusion of logic for working with tags and ensuring we can actually move the folder.

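    @classmethod
    def move(cls, obj, to):
        # Requires `import re` at the top of models.py
        if to is not None:
            parent_tag = to['tag']
            child_tag = obj['tag']

            parent_sections = parent_tag.split("#")
            child_sections = child_tag.split("#")

            # A deeper target may be nested inside the folder being moved
            if len(parent_sections) > len(child_sections):
                # Escape the tag and require the '#' separator so that
                # 'x#1' does not falsely match 'x#10'
                matches = re.match(re.escape(child_tag) + '#', parent_tag)
                if matches is not None:
                    raise Exception("You can't move this object to the specified folder")

        previous_folder_id = obj['parent_id']
        previous_folder = cls.find(previous_folder_id)
        # find() returns None if no folder document exists (e.g. the root folder)
        if previous_folder is not None:
            cls.remove_object(previous_folder, obj['id'])

        if to is not None:
            cls.add_object(to, obj['id'], True)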
    

Here we first ensure that the folder we're moving to was in fact specified and is not None. This is because the assumption here is that if this is not specified we are actually moving this folder to the root folder.

We get the tag of the folder we're trying to move and the tag of the folder we're trying to move it to. Then we compare the number of sections in the two tags, since this tells us each folder's level in the file tree. There is only one case where moving is not straightforward: when there are more parent sections than child sections (parent here refers to the folder we're moving into). We can move a folder into any folder on its own level or above, but if parent_sections is longer than child_sections, the destination folder might be nested inside the folder we're moving. We have to be careful about this because, as mentioned earlier, the folder structure is purely logical, and we need to make sure it stays consistent.

In the case that the folder we're moving to is below the folder we're moving in the file tree, we must ensure that the former folder is not nested in the latter. This can be done by ensuring the child_tag of the folder we're moving does not begin the parent_tag string. We use regex to implement this and raise an exception if this happens.
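As an alternative to the regex, the same check can be expressed by comparing tag sections directly. This is a sketch rather than the guide's implementation, and the function name is_nested_in is hypothetical. Comparing whole sections avoids treating 'a1#1' as a prefix of 'a1#10'.

```python
def is_nested_in(target_tag, folder_tag):
    """Return True if the folder tagged target_tag lies inside folder_tag."""
    target = target_tag.split('#')
    folder = folder_tag.split('#')
    # The target must be deeper, and its leading sections must equal
    # every section of the folder's tag
    return len(target) > len(folder) and target[:len(folder)] == folder


print(is_nested_in('a1#1#0', 'a1#1'))  # True: the target is inside the folder
print(is_nested_in('a1#10', 'a1#1'))   # False: different sibling folders
print(is_nested_in('a1', 'a1#1'))      # False: the target is above the folder
```

A move would be rejected whenever is_nested_in(to['tag'], obj['tag']) returns True.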

We're almost done now! Finally we will create the add_object() and remove_object() methods I had referred to earlier.

    @classmethod
    def remove_object(cls, folder, object_id):
        update_fields = folder['objects'] or []
        while object_id in update_fields:
            update_fields.remove(object_id)
        cls.update(folder['id'], {'objects': update_fields})

    @classmethod
    def add_object(cls, folder, object_id, is_folder=False):
        p = {}
        update_fields = folder['objects'] or []
        update_fields.append(object_id)
        if is_folder:
            p['last_index'] = folder['last_index'] + 1
        p['objects'] = update_fields
        cls.update(folder['id'], p)
    

As mentioned earlier, we will be doing the add and remove operations by modifying the objects list in the folder object. When adding subfolders, we put in a constraint to also update the last_index variable of the folder.
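To see which fields each operation changes, here is a database-free sketch. The helper names added_fields and removed_fields are hypothetical. They return the dictionary that would be passed to update() instead of writing anything.

```python
def added_fields(folder, object_id, is_folder=False):
    """Fields to update when object_id is added to the folder."""
    changes = {'objects': (folder['objects'] or []) + [object_id]}
    if is_folder:
        # Only subfolders consume an index, since only folders are tagged
        changes['last_index'] = folder['last_index'] + 1
    return changes


def removed_fields(folder, object_id):
    """Fields to update when object_id is removed from the folder."""
    return {'objects': [o for o in (folder['objects'] or []) if o != object_id]}


folder = {'id': 'd1', 'objects': None, 'last_index': 0}

print(added_fields(folder, 'f1'))
# {'objects': ['f1']}
print(added_fields(folder, 'd2', is_folder=True))
# {'objects': ['d2'], 'last_index': 1}

folder['objects'] = ['f1', 'd2']
print(removed_fields(folder, 'f1'))
# {'objects': ['d2']}
```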

Next Steps

We're now done with the models. Next up, controllers! Continue on to the next guide in this series: File Controller and Finishing.