Hristo Georgiev

Handling File Upload Using Ruby on Rails 5 API

  • Jan 10, 2019
  • 27 Min read
  • 277,502 Views
Ruby
Ruby on Rails

Introduction

Sending basic JSON data made up of strings to an API (Application Programming Interface) is, in most cases, a simple and straightforward task. But what about sending files, which consist of binary data in various formats? Such data requires a slightly different approach.

In this guide we will examine the two main approaches of handling file uploads, multipart form data and base64 encoding, through a Rails 5 API application using both the paperclip and the carrierwave gems.

Approaches for sending files to a Rails 5 API

Multipart form data

Multipart forms have been around since HTML 4. They were introduced because the standard application/x-www-form-urlencoded forms did not handle bigger amounts of data sufficiently well.

According to W3C, multipart/form-data is used for forms that "contain files, non-ASCII data, and binary data." What multipart forms do differently is that they break the data into different chunks representing the various characteristics of the file (size, content type, name, contents) and the characteristics of the form's standard text and number data.

To make things clearer, let's consider the following form:

<form action="http://localhost:3000/api/v1/items"
      enctype="multipart/form-data"
      method="post">
<p>
What is your name? <input type="text" name="submit-name"><br>
What file are you sending? <input type="file" name="files"><br>
</p>
<input type="submit" value="Send"> <input type="reset">
</form>
html

If the user enters "John" in the text input, and selects the text file "file1.txt", the browser will send back the following data:

Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="submit-name"

John
--AaB03x
Content-Disposition: form-data; name="files"; filename="file1.txt"
Content-Type: text/plain

... contents of file1.txt ...
--AaB03x--

Every part of the form is separated by a boundary, an arbitrary string that must not occur inside the data itself (AaB03x in the example). Each part carries its own headers, such as Content-Disposition and Content-Type, followed by its contents. Larger files can be broken down into chunks and reassembled on the server, which allows a file to be streamed and helps maintain its integrity if the connection is interrupted.
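To make the wire format more tangible, here is a minimal Ruby sketch that assembles a body like the one above by hand. The build_multipart helper and the :filename, :content_type, and :data keys are hypothetical names chosen for this illustration; in practice an HTTP client library would do this for you.

```ruby
require 'securerandom'

# Build a multipart/form-data body by hand. `fields` maps a field name either
# to a plain string or to a hash describing a file (hypothetical keys:
# :filename, :content_type, :data).
def build_multipart(fields, boundary = SecureRandom.hex(8))
  body = ''.b
  fields.each do |name, value|
    body << "--#{boundary}\r\n" # every part starts with the boundary
    if value.is_a?(Hash)
      body << %(Content-Disposition: form-data; name="#{name}"; filename="#{value[:filename]}"\r\n)
      body << "Content-Type: #{value[:content_type]}\r\n\r\n"
      body << value[:data]
    else
      body << %(Content-Disposition: form-data; name="#{name}"\r\n\r\n)
      body << value.to_s
    end
    body << "\r\n"
  end
  body << "--#{boundary}--\r\n" # the closing delimiter ends the form
  ["multipart/form-data; boundary=#{boundary}", body]
end

content_type, body = build_multipart(
  { 'submit-name' => 'John',
    'files' => { filename: 'file1.txt', content_type: 'text/plain', data: 'hello' } },
  'AaB03x'
)
puts content_type
puts body
```

The first return value is what goes into the request's Content-Type header, so the server knows which boundary string to split on.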

Now suppose the user also selects a second file, "file2.gif". The browser will construct the parts as follows:

Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="submit-name"

John
--AaB03x
Content-Disposition: form-data; name="files"
Content-Type: multipart/mixed; boundary=BbC04y

--BbC04y
Content-Disposition: file; filename="file1.txt"
Content-Type: text/plain

... contents of file1.txt ...
--BbC04y
Content-Disposition: file; filename="file2.gif"
Content-Type: image/gif
Content-Transfer-Encoding: binary

...contents of file2.gif...
--BbC04y--

Here, another part has been added to the form data. This time, since the file is not in text/plain format, its contents are sent as raw binary, as can be inferred from the Content-Transfer-Encoding header.

The Content-Type header gives information about the type of the file, also known as its media (MIME) type. If the Content-Type header is not defined and the file is not in text format, its format will default to application/octet-stream, which means that the final file will be treated as binary data without a specific type.

When sending data to an API, it is always good to include a Content-Type for each part that contains a file; otherwise, the server has no declared type to validate the file against.
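Because the Content-Type of a part is supplied by the client, a server may also want to compare it with the first bytes of the file. The following is a rough sketch of that idea, not something the gems used later in this guide do in this form; the MAGIC_BYTES table and both helper names are assumptions made for the example.

```ruby
# Leading "magic" bytes for a few of the formats mentioned in this guide.
MAGIC_BYTES = {
  'application/pdf' => '%PDF-'.b,
  'image/gif'       => 'GIF8'.b,
  'image/png'       => "\x89PNG\r\n\x1A\n".b,
  'image/jpeg'      => "\xFF\xD8\xFF".b
}.freeze

# Return the detected media type, or application/octet-stream when unknown.
def sniff_media_type(bytes)
  data = bytes.b
  match = MAGIC_BYTES.find { |_type, magic| data.start_with?(magic) }
  match ? match.first : 'application/octet-stream'
end

# True when the declared Content-Type agrees with the file's leading bytes.
def declared_type_plausible?(declared, bytes)
  sniff_media_type(bytes) == declared
end

puts sniff_media_type('%PDF-1.4 ...')
puts declared_type_plausible?('image/gif', 'GIF89a...')
```

A check like this only looks at a few bytes, so it complements the declared type rather than replacing proper validation.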

Base64 encoding

Base64 is one of the most commonly used binary-to-text encoding formats. It breaks binary data into pieces and converts them into ASCII characters (text). It has a wide array of applications: apart from being used to encode files into text in order to send them to an API, it is also used to represent images as a content source in CSS, HTML, and SVG.

The structure of base64-encoded files is very simple. It consists of two parts: a MIME type (similar to the multipart Content-Type) and the actual base64-encoded string:

data:<MIME type>;base64,...//rest of the base64 text

When it comes to uploading, the file is encoded into base64 on the client and decoded on the server. The base64 string can be easily attached to a JSON object's attribute:

{
  "file": {
    "name": "file2",
    "contents": "data:image/gif;base64, iVBORw0KGgoAAA..."
  }
}
json

An API would easily be able to pick up the parameter and decode it back to binary. This makes base64-encoded file uploads convenient for APIs, since the format of the message containing the file and the way it is transferred does not differ from the standard way information is sent to an API.
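As a rough illustration of that round trip, the sketch below encodes some bytes into a JSON attribute and decodes them again using only Ruby's standard library. The decode_data_uri helper is a hypothetical name, and the "image" is just placeholder text.

```ruby
require 'base64'
require 'json'

# Split a data URI of the form "data:<MIME type>;base64,<payload>" into its
# MIME type and decoded bytes. Returns nil when the string has another shape.
def decode_data_uri(uri)
  match = uri.match(/\Adata:([^;,]+);base64,\s*(.*)\z/m)
  return nil unless match

  { content_type: match[1], data: Base64.decode64(match[2]) }
end

# Round trip: the "client" encodes, the "server" decodes.
original = 'GIF89a fake image bytes'.b
payload  = JSON.generate(
  file: { name: 'file2',
          contents: "data:image/gif;base64,#{Base64.strict_encode64(original)}" }
)

decoded = decode_data_uri(JSON.parse(payload)['file']['contents'])
puts decoded[:content_type]
puts decoded[:data] == original
```

The regular expression tolerates whitespace after the comma, since the sample payload in this guide is written that way.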

Base64 encoding vs. Multipart form data

Base64-encoded files can be easily used by JSON and XML APIs since they are represented as text and can be sent through the standard application/json format. However, the encoding increases the file size by roughly 33% (every 3 bytes become 4 characters), making it less practical for transferring larger files. Encoding and decoding also add computational overhead on both the server and the client. Therefore, base64 is best suited for sending images and small files under 100MB.
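The size increase is easy to verify. This small sketch compares the length predicted by the 3-bytes-to-4-characters rule with what Ruby's Base64 module actually produces; base64_size is a helper name introduced only for this example.

```ruby
require 'base64'

# Base64 maps every 3 input bytes to 4 output characters and pads the last group.
def base64_size(byte_count)
  ((byte_count + 2) / 3) * 4
end

raw     = "\x00".b * 3_000_000 # 3 MB of dummy binary data
encoded = Base64.strict_encode64(raw)
ratio   = encoded.bytesize.to_f / raw.bytesize

puts encoded.bytesize          # 4000000
puts base64_size(raw.bytesize) # 4000000
puts ratio.round(3)            # 1.333
```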

Multipart forms, on the other hand, are more "unnatural" to the APIs, since the data is encoded in a different format and requires a different way of handling. However, the increased performance with larger files and the ability to stream files makes multipart form data more desirable for uploading larger files, such as videos.
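The streaming advantage comes down to never holding the whole file in memory at once. A minimal sketch of the idea, with StringIO objects standing in for the upload and for the storage target (copy_in_chunks is a hypothetical helper, not part of Rails or either gem):

```ruby
require 'stringio'

# Copy `source` to `destination` in fixed-size chunks so that only one chunk
# is held in memory at a time. Returns the number of bytes copied.
def copy_in_chunks(source, destination, chunk_size = 64 * 1024)
  copied = 0
  while (chunk = source.read(chunk_size)) # read returns nil at end of input
    destination.write(chunk)
    copied += chunk.bytesize
  end
  copied
end

upload  = StringIO.new('x' * 200_000) # stands in for an uploaded video
storage = StringIO.new
puts copy_in_chunks(upload, storage)  # 200000
```

A base64 string inside a JSON document, by contrast, usually has to be parsed and decoded as a whole before the file can be written.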

Setting up Rails API for file upload

Let's apply these concepts. First, let's set up a Rails 5 API application and scaffold a model that is going to be used.

Installing and configuring Rails 5 API

To create a Rails 5 API, you need Ruby 2.2.2 or newer installed. After you have a suitable Ruby version, the first step is to install the newest version of Rails through your terminal/command prompt:

gem install rails --pre --no-ri --no-rdoc
bash

This will give you the ability to run rails new using the most recent version (currently 5.0.0.beta3):

rails _5.0.0.beta3_ new fileuploadapp --api
bash

This will create a brand new Rails 5 app, and with the --api option included in the generator, it will be all set up to be used as an API. Move to the directory of the new project:

cd fileuploadapp
bash

Go to the Gemfile and uncomment jbuilder and rack-cors.

# Gemfile
gem 'jbuilder', '~> 2.0'
gem 'rack-cors'
ruby

Jbuilder is used to create JSON structures for the responses from the application. In MVC terms, the JSON responses are going to be the view layer (AKA UI layer or presentation layer) of the application. Therefore, all Jbuilder-generated responses have to be put in app/views/(view for a particular controller action).

rack-cors enables cross-origin resource sharing (CORS). Simply put, rack-cors will enable browser-based applications (AngularJS, React) and mobile applications to request information from the API. Go to the application configuration file and add the configurations for CORS:

# config/application.rb
module Fileuploadapp
  class Application < Rails::Application
    # Pass the middleware class itself; Rails 5 deprecates string names here.
    config.middleware.insert_before 0, Rack::Cors do
      allow do
        origins '*'
        resource '*', :headers => :any, :methods => [:get, :post, :options]
      end
    end
    config.api_only = true
  end
end
ruby

This configuration will give full access to the API (* means that everything is accepted). This is not a problem for this guide, since we are going to work only with a local server.

Install the gems:

bundle install
bash

With the gems installed, the model is going to be scaffolded with Jbuilder-generated views and be ready to be consumed by client-side applications.

rails g scaffold Item name:string description:string
bash

A model named Item is going to be scaffolded which has a name and description as strings. Note that there is still no information about the file - file information will be included in the model's schema at a later stage. For now, we are all set to go ahead and migrate the model into the database:

rails db:migrate
bash

This will create a table in the database for the new model. Note that in Rails 5 you can use rails instead of rake for executing a migration command.

To continue with the guide, you have to choose the gem with which you would like to implement your file uploads. If you would like to try both, I recommend that you use version control such as Git so that the implementations don't clash.

File upload using Paperclip

To get started with Paperclip, first add the gem to your Gemfile:

# Gemfile
gem "paperclip", "~> 5.0.0.beta1"
ruby

And install it:

bundle install
bash

Uploading a single file

First, generate a migration that will add the attachment to the database. You have to put the same name for the attachment in the model as the one you put in the generator (in this case, it is picture).

rails g paperclip item picture
bash

After you are done, don't forget to migrate the database:

rails db:migrate
bash

Second, add the following lines in the file of the model:

# app/models/item.rb
class Item < ApplicationRecord
  has_attached_file :picture, styles: { medium: "300x300>", thumb: "100x100>" }, default_url: "/images/:style/missing.png"
  validates_attachment :picture, presence: true
  do_not_validate_attachment_file_type :picture
end
ruby

Here is what each method is about:

  1. has_attached_file is the main method for adding a file attachment. The first argument is the attribute of the model that is going to be used for the file attachment (in this case it is :picture, as we know from the database migration). styles: is an optional parameter that tells Paperclip to generate resized versions of each uploaded image and store them under their style name. In this example, the medium version is scaled down to fit within 300x300 pixels and the thumb version within 100x100 pixels; the trailing > means that images already smaller than the target are left untouched. default_url is also an optional parameter used to specify the path of a default image that will be returned if the object from the model does not have an attached image. You can put the default image in app/assets/images/(medium or thumb)/. If you are uploading files other than images, you may omit the optional arguments.

  2. validates_attachment can validate the content type, presence, and size of the file. The Paperclip documentation lists the syntax for all the validations. In this example, only the presence is checked.

  3. do_not_validate_attachment_file_type is used as there are issues with the file type validations of Paperclip.
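To get a feel for what a geometry string such as "300x300>" asks for, here is a plain-Ruby approximation of the arithmetic. It is only a sketch: Paperclip delegates the real resizing to ImageMagick, whose rounding may differ slightly, and shrink_to_fit is a name invented for this example.

```ruby
# Approximate the dimensions produced by a "WIDTHxHEIGHT>" geometry string:
# shrink to fit inside the box, keep the aspect ratio, never enlarge.
def shrink_to_fit(width, height, geometry)
  max_w, max_h = geometry.delete('>').split('x').map(&:to_i)
  return [width, height] if width <= max_w && height <= max_h

  scale = [max_w.to_f / width, max_h.to_f / height].min
  [(width * scale).round, (height * scale).round]
end

p shrink_to_fit(1200, 800, '300x300>') # [300, 200]
p shrink_to_fit(200, 100, '300x300>')  # [200, 100]
```

The second call shows the effect of the trailing >: an image that already fits is returned unchanged.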

Third, add a permitted parameter that is going to accept :picture:

# app/controllers/items_controller.rb
def item_params
  params.require(:item).permit(:name, :description, :picture) # Add :picture as a permitted parameter
end
ruby

The last step is to add the :picture to the Jbuilder view, so that when we GET an item, it will return its picture:

json.extract! @item, :id, :name, :description, :picture, :created_at, :updated_at
ruby

Uploading multiple files

Attaching a document model to item

To make uploading multiple files using Paperclip possible, add another model that is related to the main model and that will contain the files. In the current example, let's add the capability for an item to have multiple PDF documents. First, let's generate a model:

rails g model document item:references
rails g paperclip document file
bash

The first generator is going to produce a model named Document. item:references will generate a belongs_to :item reference, which means that the item and the document will have a one-to-many relationship. The second generator will generate fields for the attachment.

A reference also has to be added to the Item model:

# app/models/item.rb
class Item < ApplicationRecord
  has_many :documents
  #...
end
ruby

Migrate the changes to the database schema:

rails db:migrate
bash

With this step finished, the document model is ready to be configured. Let's add the necessary methods and validations to add PDF documents.

# app/models/document.rb
class Document < ApplicationRecord
  belongs_to :item
  has_attached_file :file
  validates_attachment :file, presence: true, content_type: { content_type: "application/pdf" }
end
ruby

This time, validates_attachment checks if the document's type is application/pdf.

With this addition, the document model is ready. Next, things need to be handled in the controller in order to handle creation of multiple files.

Creating an item with documents

When a new item is created, another parameter must get sent. document_data will contain an array of data about each document, either in multipart or in base64 format:

# app/controllers/items_controller.rb
def item_params
  params.require(:item).permit(:name, :description, :document_data => []) # Add :document_data in permit() to accept an array
end
ruby

Add the logic for creating multiple files in the controller:

# app/controllers/items_controller.rb
#...
def create
  @item = Item.new(item_params)

  if @item.save
    @item.save_attachments(item_params) if params[:item][:document_data]
    render :show, status: :created, location: @item
  else
    render json: @item.errors, status: :unprocessable_entity
  end
  #...
end
ruby

In order to get :document_data into the model, an attr_accessor needs to be added to the model:

# app/models/item.rb
class Item < ApplicationRecord
  attr_accessor :document_data
end
ruby

After the item is successfully saved, the document_data parameter has to be checked. If document data was sent, the controller will call the save_attachments method, which will attempt to create the documents. Here is how the model method should look:

# app/models/item.rb
class Item < ApplicationRecord
  #...
  def save_attachments(params)
    params[:document_data].each do |doc|
      self.documents.create(:file => doc)
    end
  end
  #...
end
ruby

The save_attachments method is going to go through all the items in the document_data array. For each document, there is going to be an object from the document model created.

All done! Now it is possible to add multiple files to an item. Add :documents to the Jbuilder view so that the documents get returned together with the item.

# app/views/items/show.json.jbuilder
json.extract! @item, :id, :name, :description, :picture, :documents, :created_at, :updated_at
ruby

Base64 upload

Suppose you want to create items and documents by sending in base64-encoded strings instead of form data. There are several additional steps that should be done for this to be achieved. As mentioned before, base64 involves much more overhead computation work and much more memory, but it is more easily integrated into the API.

Modifying the item

First, the attr_accessor method will be used to get the base64 encoded string in the model:

# app/models/item.rb
class Item < ApplicationRecord
  attr_accessor :image_base
  #...
end
ruby

Second, add a private method for decoding image_base and assigning it to the picture attribute of the item.

# app/models/item.rb

private

def parse_image
  return if image_base.blank? # nothing to decode, e.g. when saving without a new image
  image = Paperclip.io_adapters.for(image_base)
  image.original_filename = "file.jpg"
  self.picture = image
end
ruby

The parse_image method takes image_base and puts it into Paperclip's IO adapters. They contain a registry which can turn the base64 string into binary. Because the file name is not stored, you can either put an arbitrary value (like "file.jpg", even if your file is not in jpg format) or add another attr_accessor for the name itself. Finally, self.picture = image assigns the image to the current instance of the object.
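If a hard-coded "file.jpg" feels too arbitrary, one option is to derive the extension from the MIME type at the start of the base64 string. The sketch below is plain Ruby and independent of Paperclip; the EXTENSIONS table and the filename_for helper are assumptions made for this example.

```ruby
# Hypothetical lookup table; extend it with the types your API accepts.
EXTENSIONS = {
  'image/jpeg'      => 'jpg',
  'image/png'       => 'png',
  'image/gif'       => 'gif',
  'application/pdf' => 'pdf'
}.freeze

# Derive a filename from the MIME type at the start of a data URI, falling
# back to a generic extension when the type is missing or unknown.
def filename_for(data_uri, basename = 'file')
  mime = data_uri[/\Adata:([^;,]+)/, 1]
  ext  = EXTENSIONS.fetch(mime, 'bin')
  "#{basename}.#{ext}"
end

puts filename_for('data:image/png;base64,iVBORw0KGgo...')                # file.png
puts filename_for('data:application/pdf;base64,JVBERi0...', 'document')  # document.pdf
```

Inside parse_image, the result of such a helper could be assigned to original_filename in place of the fixed string.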

The method is ready, but it has to be called every time a new object is created. Let's add a filter that will call the parse_image method when that happens:

# app/models/item.rb
class Item < ApplicationRecord
  before_validation :parse_image
  attr_accessor :image_base
  #...
end
ruby

The before_validation method will ensure that the base64 string will be parsed and assigned to the object instance before Paperclip validates the size, content type, and the rest of the attributes of the file.

In order to get image_base, it has to be entered as a parameter in the controller:

# app/controllers/items_controller.rb
def item_params
  params.require(:item).permit(:name, :description, :image_base, :document_data => []) # change :picture to :image_base
end
ruby

Modifying the documents

Similarly to the item, the document data has to be decoded before being saved. So instead of sending a :file directly, the :file_contents have to be sent:

# app/models/item.rb
def save_attachments(params)
  params[:document_data].each do |doc|
    self.documents.create(:file_contents => doc) # change from :file to :file_contents
  end
end
ruby

Similarly to the item model, add attr_accessor for the :file_contents and a method to decode the base64-encoded content:

# app/models/document.rb
class Document < ApplicationRecord
  belongs_to :item
  before_validation :parse_file
  attr_accessor :file_contents
  has_attached_file :file
  validates_attachment :file, presence: true, content_type: { content_type: "application/pdf" }

  private

  def parse_file
    file = Paperclip.io_adapters.for(file_contents)
    file.original_filename = "pdfile.pdf"
    self.file = file
  end
end
ruby

File upload using CarrierWave

Add the CarrierWave gem to your Gemfile:

# Gemfile
gem 'carrierwave'
ruby

Run the installation:

bundle install
bash

Uploading a single file

CarrierWave isolates the logic of the uploaded files from the logic of the models using uploaders. To generate an uploader for the picture of the Item model, run the following command:

rails g uploader Picture
rails g migration add_picture_to_items picture:string
bash

The first generator will create a new directory in the application: app/uploaders. This is where all the uploaders are going to be stored. The second generator will add a picture column as a string in the items table, which will store a reference to the file.

Don't forget to migrate the database:

rails db:migrate
bash

In the generated uploader, you can see various configuration options. Right now we need the file type validation option. Since the uploader is used to upload pictures, the allowed file extensions need to be whitelisted (or approved):

# app/uploaders/picture_uploader.rb
def extension_white_list
  %w(jpg jpeg gif png)
end
ruby
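Conceptually, an extension whitelist is a check on the file name, not on the file's contents. The following plain-Ruby sketch shows that kind of check; it is not CarrierWave's actual implementation, and extension_allowed? is a hypothetical helper.

```ruby
# Plain-Ruby sketch of an extension whitelist check.
def extension_allowed?(filename, whitelist)
  extension = File.extname(filename).delete('.').downcase
  whitelist.include?(extension)
end

PICTURE_EXTENSIONS = %w[jpg jpeg gif png].freeze

puts extension_allowed?('photo.JPG', PICTURE_EXTENSIONS)  # true
puts extension_allowed?('report.pdf', PICTURE_EXTENSIONS) # false
```

Since a file can be renamed freely, a check like this filters out honest mistakes rather than malicious uploads.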

Now, mount the uploader to the item model:

class Item < ApplicationRecord
  mount_uploader :picture, PictureUploader
end
ruby

When a new model instance is created, the uploader will automatically associate the picture with it. The picture attribute exposes the URL of the image, which can be reached through item.picture.url on an Item instance.

Next, add :picture as a permitted parameter:

# app/controllers/items_controller.rb
def item_params
  params.require(:item).permit(:name, :description, :picture) # Add :picture as a permitted parameter
end
ruby

Last, add the :picture to the Jbuilder view, so that when we GET an item, the API will return its picture:

# app/views/items/show.json.jbuilder
json.extract! @item, :id, :name, :description, :picture, :created_at, :updated_at
ruby

Uploading multiple files

Let's add PDF documents to the item. This will require adding another uploader as well as a model named document belonging to the item. To accomplish this task, run these two generators:

rails g model document item:references document:string
rails g uploader Document
bash

The first generator creates the document model with a reference to the Item and a document column that stores a reference to its CarrierWave file. The second generator creates the uploader in the app/uploaders directory. Add the new document model to the database schema:

rails db:migrate
bash

Mount the uploader in the document model:

# app/models/document.rb
class Document < ApplicationRecord
  belongs_to :item
  mount_uploader :document, DocumentUploader
end
ruby

Because the documents will be PDF only, adjust the whitelist accordingly in the uploader:

# app/uploaders/document_uploader.rb
def extension_white_list
  %w(pdf)
end
ruby

In the Item model, add the following lines:

# app/models/item.rb
class Item < ApplicationRecord
  mount_uploader :picture, PictureUploader
  has_many :documents
  attr_accessor :document_data
end
ruby

has_many :documents adds a one-to-many relation between the item and the documents. attr_accessor :document_data will let us send extra attributes to the controller. This attribute is going to contain an array with data about every PDF document.

We are done with the model! Let's proceed to the controllers:

  1. Add the document_data array as a permitted parameter. Array parameters must always be the last parameter:
# app/controllers/items_controller.rb
def item_params
  params.require(:item).permit(:name, :description, :picture, :document_data => []) # add document_data as a permitted parameter
end
ruby
  2. Add a loop that will create documents after the item is saved:
# app/controllers/items_controller.rb
def create
  @item = Item.new(item_params)

  if @item.save
    # iterate through each of the files; Array() turns a missing document_data into an empty list
    Array(params[:item][:document_data]).each do |file|
      # create a document associated with the item that has just been created
      @item.documents.create!(:document => file)
    end
    render :show, status: :created, location: @item
  else
    render json: @item.errors, status: :unprocessable_entity
  end
end
ruby

Finally, update your Jbuilder view, so that the documents get returned together with the item:

# app/views/items/show.json.jbuilder
json.extract! @item, :id, :name, :description, :picture, :documents, :created_at, :updated_at
ruby

Base64 upload

Uploading base64-encoded files is a very easy task with CarrierWave. Just add the carrierwave-base64 gem:

# Gemfile
gem 'carrierwave-base64'
ruby

Install it:

bundle install
bash

Go to your models and replace mount_uploader with mount_base64_uploader:

# app/models/item.rb
mount_base64_uploader :picture, PictureUploader
ruby

# app/models/document.rb
mount_base64_uploader :document, DocumentUploader
ruby

And voila, your application can now accept base64-encoded files. It is as simple as that.

Format of the request

Using multipart form data

For POSTing multipart form data, cURL is convenient enough. Keep in mind that the paths included are arbitrary and you have to add your own. Also, when sending files, make sure to declare their type (type=application/pdf). Otherwise, cURL sends files as a binary type (application/octet-stream), which may break some of the validations.

curl \
  -F "item[document_data][]=@E:ile2.pdf;type=application/pdf" \
  -F "item[document_data][]=@E:ile1.pdf;type=application/pdf" \
  -F "item[picture]=@E:photo.jpg" \
  -F "item[name]=item" \
  -F "item[description]=desc" \
  localhost:3000/items
bash
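The same request can also be put together from Ruby with the standard library's Net::HTTP. The sketch below only builds the request and does not send it; build_item_request is a hypothetical helper, the file names are placeholders, and StringIO objects stand in for real files (which you would open with File.open(path, 'rb')).

```ruby
require 'net/http'
require 'stringio'
require 'uri'

# Build (but do not send) a multipart POST equivalent to the cURL call above.
def build_item_request(uri, name:, description:, picture:, documents: [])
  form = [
    ['item[name]', name],
    ['item[description]', description],
    ['item[picture]', picture, { filename: 'photo.jpg', content_type: 'image/jpeg' }]
  ]
  documents.each_with_index do |doc, index|
    form << ['item[document_data][]', doc,
             { filename: "file#{index + 1}.pdf", content_type: 'application/pdf' }]
  end

  request = Net::HTTP::Post.new(uri)
  request.set_form(form, 'multipart/form-data') # Net::HTTP encodes the parts when sending
  request
end

uri = URI('http://localhost:3000/items')
request = build_item_request(
  uri,
  name: 'item', description: 'desc',
  picture: StringIO.new('fake jpeg bytes'),
  documents: [StringIO.new('%PDF-1.4 one'), StringIO.new('%PDF-1.4 two')]
)
puts request.method
puts request.path

# To actually send it (requires the API to be running):
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

As with cURL, declaring the content type of each file part keeps the server-side validations from seeing application/octet-stream.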

Using base64 strings in JSON

If you want to send base64 data, keep in mind that the strings are long, and cURL might be tedious to use. I would recommend Postman. Note: Replace "picture" with "image_base" if you used Paperclip. Here is the JSON format that the API is going to accept:

{
  "item": {
    "name": "tem",
    "description": "desc",
    "picture": "data:image/png;base64, Rt2...",
    "document_data": [
      "data:application/pdf;base64, fW3...",
      "data:application/pdf;base64, pLf..."
    ]
  }
}
json
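A client written in Ruby could assemble such a body along the following lines. This is a sketch under a few assumptions: data_uri and item_payload are hypothetical helpers, the MIME types are fixed for brevity, and the placeholder strings stand in for bytes that would normally come from File.binread(path).

```ruby
require 'base64'
require 'json'

# Wrap raw bytes in a data URI of the form "data:<MIME type>;base64,<payload>".
def data_uri(bytes, content_type)
  "data:#{content_type};base64,#{Base64.strict_encode64(bytes)}"
end

# Build the JSON body expected by the API. Use "image_base" instead of
# "picture" as the key if you followed the Paperclip implementation.
def item_payload(name:, description:, picture:, documents: [])
  JSON.generate(
    item: {
      name: name,
      description: description,
      picture: data_uri(picture, 'image/png'),
      document_data: documents.map { |doc| data_uri(doc, 'application/pdf') }
    }
  )
end

puts item_payload(name: 'item', description: 'desc',
                  picture: 'fake png bytes', documents: ['%PDF-1.4 one'])
```

The resulting string can be sent with a Content-Type of application/json, just like any other request to the API.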

Conclusion

Now you know how to upload files. You can now go ahead and set up a client-side application that can consume the API. In case you got lost in the guide, I created a GitHub repository and added branches for each of the steps and the implementations done in the tutorial.

My personal preference for file upload is CarrierWave because it offers a more straightforward approach. The logic for the file is isolated into uploaders, which can be reused repeatedly for multiple models. This makes the code more maintainable. It took me considerably less time to implement file uploading for both single and multiple items. With the carrierwave-base64 gem, I did not have to add any custom logic for decoding and saving images; changing the name of the mounting method was enough. The only significant drawback is that CarrierWave has a lot of "magic" going on behind it. It does not feel as customizable as Paperclip, and debugging on CarrierWave seems more difficult.

On the other hand, Paperclip offers a more personalized approach to file uploads. All the logic is stored inside the model, and there is no need to encapsulate the logic outside the model. Files are treated more like standard fields in the model rather than independent parts that are merely associated with the model. Debugging is somewhat intuitive, since any errors are simply treated as attribute validation errors. However, making the image uploading more generic requires more tinkering and experience (putting methods in ApplicationRecord is one way). Moreover, base64 uploads do not come as naturally as they do for CarrierWave, so there is a need for custom logic.

To conclude, both CarrierWave and Paperclip have their pros and cons, and the choice of which to use depends on the case. Both gems are popular and have big communities built around them. The same applies for the different ways of uploading files - both base64 and multipart forms are widely-used means of uploading files to APIs. The choice depends on the scope and requirements of the project you are working on.