Efficient photo upload architecture
This post is a bit more technical: it describes how we made image uploads so fast.
Since Bablab is focused on portfolio websites, our users tend to have a relatively large number of images on their sites. This makes image uploading a key feature, one that demands good performance, control, and efficiency.
Pre-upload processing
There are a few tasks we prefer to run before uploading the images, taking advantage of the fact that the images are available locally and that we have all the CPU power of a desktop machine.
Resizing
Images may be really big, both in dimensions and in file size. We want to limit the file size and, when needed, resize images to our maximum width and height. This saves disk space in the cloud and reduces the bandwidth used for uploading.
* We upload a single instance of each image at the maximum size and quality. Once it is uploaded, serverless functions create the other sizes and formats. These procedures are triggered and executed separately, so they do not affect upload speed.
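As a rough illustration of the resizing step, here is a minimal sketch of how target dimensions could be computed so that an image fits inside a maximum width and height while keeping its aspect ratio. The function name `fitWithin` and the `Size` type are hypothetical and are not taken from our codebase; the limits are passed in by the caller.

```typescript
// Hypothetical helper for illustration only; names and limits are assumptions.
interface Size {
  width: number;
  height: number;
}

// Returns the largest size that fits inside `max` while keeping the
// aspect ratio of `original`. Images already within the limits are
// returned unchanged (the scale factor is capped at 1, so no upscaling).
function fitWithin(original: Size, max: Size): Size {
  const scale = Math.min(
    1,
    max.width / original.width,
    max.height / original.height
  );
  return {
    width: Math.max(1, Math.round(original.width * scale)),
    height: Math.max(1, Math.round(original.height * scale)),
  };
}
```

In the browser, the resulting size could then be used to draw the image onto a canvas of those dimensions and export it at the desired quality.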
Extracting meta-data
Images coming out of Lightroom/CaptureOne will probably carry meta-data such as Exif and XMP. Many upload libraries strip this data, but we would like to keep it, as it contains valuable info. We extract it beforehand so we can save it in our database for future editing.
We can also capture the natural width and height at this stage, to have all image-related data available before actually uploading it.
We keep all this data locally so it is available for further processing.
Uploading directly to the cloud
Since all the images are processed in the browser and will be hosted in the cloud, there is no point in uploading them to our server first. The upload should go directly to the cloud.
This is done using a server-generated pre-signed URL (for example, a PHP library for producing pre-signed upload form inputs for AWS S3: https://github.com/eddturtle/direct-upload) and a JavaScript upload library that supports multiple files as well as chunked and resumable uploads (we use https://github.com/blueimp/jQuery-File-Upload, though we would be happy to get rid of jQuery altogether at some point).
[update]
We have finally replaced the file upload library. We got rid of the huge jQuery library and now use our own super-fast vanilla JavaScript upload class: https://github.com/Adifmac/Direct-browser-s3-uploader.
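To give an idea of what chunked and resumable uploading involves, here is a minimal sketch of the bookkeeping side: splitting a file into byte ranges and working out which ranges still need to be sent after an interruption. It is not the code of either library mentioned above; `planChunks`, `remainingChunks` and `ChunkRange` are hypothetical names, and the actual network calls are left out.

```typescript
// Hypothetical sketch; not the API of jQuery-File-Upload or our own uploader.
interface ChunkRange {
  index: number;
  start: number; // inclusive byte offset
  end: number;   // exclusive byte offset, matching Blob.slice(start, end)
}

// Splits a file of `totalBytes` into consecutive ranges of at most `chunkBytes`.
function planChunks(totalBytes: number, chunkBytes: number): ChunkRange[] {
  if (chunkBytes <= 0) {
    throw new Error("chunkBytes must be positive");
  }
  const chunks: ChunkRange[] = [];
  for (let start = 0, index = 0; start < totalBytes; start += chunkBytes, index++) {
    chunks.push({ index, start, end: Math.min(start + chunkBytes, totalBytes) });
  }
  return chunks;
}

// Resuming: keep only the chunks whose index has not been confirmed as uploaded.
function remainingChunks(all: ChunkRange[], uploaded: Set<number>): ChunkRange[] {
  return all.filter((chunk) => !uploaded.has(chunk.index));
}
```

Each range maps to a `file.slice(start, end)` call in the browser, so a failed upload only needs to resend the chunks that are still missing rather than the whole file.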
Updating our server
Each image upload returns its cloud URL, which we then match with the meta-data we stored locally. Once all images are uploaded, we update our server with all the new images, including all their meta-data, in a single request.
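The matching step can be sketched roughly as follows. This is an illustration under assumptions: the record shapes (`LocalMeta`, `UploadResult`, `ImageRecord`) and the function `buildBatchPayload` are hypothetical, and matching by file name is chosen here only for simplicity; a real implementation might prefer a generated ID, since file names can collide.

```typescript
// Hypothetical record shapes for illustration; not our actual data model.
interface LocalMeta {
  fileName: string;
  width: number;
  height: number;
  exif?: Record<string, string>;
}

interface UploadResult {
  fileName: string;
  url: string; // cloud URL returned by the upload
}

interface ImageRecord extends LocalMeta {
  url: string;
}

// Joins locally stored meta-data with the URLs returned by the uploads.
// Images without a matching upload result are skipped.
function buildBatchPayload(local: LocalMeta[], uploads: UploadResult[]): ImageRecord[] {
  const urlByName = new Map<string, string>();
  for (const upload of uploads) {
    urlByName.set(upload.fileName, upload.url);
  }
  const records: ImageRecord[] = [];
  for (const meta of local) {
    const url = urlByName.get(meta.fileName);
    if (url !== undefined) {
      records.push({ ...meta, url });
    }
  }
  return records;
}
```

The resulting array can be serialized with `JSON.stringify` and sent to the server in one request, which is what keeps the server load independent of the number of images.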
Summary
With this architecture we efficiently support uploading several dozen images at once, directly in the browser, without disrupting the user’s flow of actions, while the effect on our server is minimal.