This post was written by Amin Shah Gilani, JavaScript developer for Toptal.

I love building things—what developer doesn’t? I love thinking up solutions to interesting problems, writing implementations, and creating beautiful code. However, what I don’t like is operations. Operations is everything not involved in building great software—everything from setting up servers to getting your code shipped to production.

This is interesting, because as a freelance Ruby on Rails developer, I frequently have to create new web applications and repeat the process of figuring out the DevOps side of things. Fortunately, after creating dozens of applications, I’ve finally settled on a perfect initial deployment pipeline. Unfortunately, not everyone’s got it figured out like I have—eventually, this knowledge led me to take the plunge and document my process.

In this article, I’ll walk you through my perfect pipeline to use at the beginning of your project. With my pipeline, every push is tested, the master branch is deployed to staging with a fresh database dump from production, and versioned tags are deployed to production with back-ups and migrations happening automatically.

Note, since it’s my pipeline, it’s also opinionated and suited to my needs; however, you can feel free to swap out anything you don’t like and replace it with whatever strikes your fancy. For my pipeline, we’ll use:

  • GitLab to host code.
    • Why: My clients prefer their code to remain secret, and GitLab’s free tier is wonderful. Also, integrated free CI is awesome. Thanks GitLab!
    • Alternatives: GitHub, BitBucket, AWS CodeCommit, and many more.
  • GitLab CI to build, test, and deploy our code.
    • Why: It integrates with GitLab and is free!
    • Alternatives: TravisCI, Codeship, CircleCI, DIY with Fabric8, and many more.
  • Heroku to host our app.
    • Why: It works out of the box and is the perfect platform to start off on. You can change this in the future, but not every new app needs to run on a purpose-built Kubernetes cluster. Even Coinbase started off on Heroku.
    • Alternatives: AWS, DigitalOcean, Vultr, DIY with Kubernetes, and many more.

Old-school: Create a Basic App and Deploy It to Heroku

First, let’s recreate a typical application for someone who isn’t using any fancy CI/CD pipelines and just wants to deploy their application.

Diagram of traditional code hosting and deploying actions

It doesn’t matter what kind of app you’re creating, but you will require Yarn or npm. For my example, I’m creating a Ruby on Rails application because it comes with migrations and a CLI, and I already have the configuration written for it. You’re welcome to use any framework or language you prefer, but you’ll need Yarn to do the versioning I do later on. I’m creating a simple CRUD app using only a few commands and no authentication.
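
If you want to follow along with the same stack, a minimal sketch of the kind of commands involved might look like this (assuming a recent Rails version with PostgreSQL; the app and model names are just placeholders):

# Sketch only: create a throwaway CRUD app; names are placeholders
rails new toptal-pipeline --database=postgresql
cd toptal-pipeline
rails generate scaffold post title:string body:text
rails db:create
rails db:migrate
rails server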

Let’s test whether our app is running as expected. I went ahead and created a few posts, just to make sure.
The application running in development

Now let’s deploy it to Heroku by pushing our code and running migrations:

$ heroku create toptal-pipeline
Creating ⬢ toptal-pipeline... done
https://toptal-pipeline.herokuapp.com/ | https://git.heroku.com/toptal-pipeline.git
$ git push heroku master
Counting objects: 132, done.
...
To https://git.heroku.com/toptal-pipeline.git
 * [new branch]      master -> master
$ heroku run rails db:migrate
Running rails db:migrate on ⬢ toptal-pipeline... up, run.9653 (Free)
...

Finally, let’s test it out in production.

The application running in production

And that’s it! Typically, this is where most developers leave their operations. In the future, if you make changes, you would have to repeat the deploy and migration steps above. You may even run tests if you’re not running late for dinner. This is great as a starting point, but let’s think about this method a bit more.

Pros

  • Quick to set up.
  • Deployments are easy.

Cons

  • Not DRY: Requires repeating the same steps on every change.
  • Not versioned: “I’m rolling back yesterday’s deployment to last week’s” isn’t very specific three weeks from now.
  • Not bad-code-proof: You know you’re supposed to run tests, but no one’s looking, so you might push it despite the occasional broken test.
  • Not bad-actor-proof: What if a disgruntled developer decides to break your app by pushing code with a message about how you don’t order enough pizzas for your team?
  • Does not scale: Allowing every developer to deploy would give them production-level access to the app, violating the Principle of Least Privilege.
  • No staging environment: Errors specific to the production environment won’t show up until production.

The Perfect Initial Deployment Pipeline

I’m going to try something different today: Let’s have a hypothetical conversation. I’m going to give “you” a voice, and we’ll talk about how we can improve this current flow. Go ahead, say something.

Say what? Wait—I can talk?

Yes, that’s what I meant about giving you a voice. How are you?

I’m good. This feels weird

I understand, but just roll with it. Now, let’s talk about our pipeline. What’s the most annoying part about running deployments?

Oh, that’s easy. The amount of time I waste. Have you ever tried pushing to Heroku?

Yeah, watching your dependencies downloading and application being built as part of the git push is horrible!

I know, right? It’s insane. I wish I didn’t have to do that. There’s also the fact that I have to run migrations *after* deployment so I have to watch the show and check to make sure my deployment runs through

Okay, you could actually solve that latter problem by chaining the two commands with &&, like git push heroku master && heroku run rails db:migrate, or by creating a bash script and keeping it in your repo. Still, great answer: the time and repetition are a real pain.
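
For example, a tiny deploy script along those lines might look like the sketch below (the file name is hypothetical, and it assumes the heroku Git remote and CLI authentication already exist):

#!/bin/bash
# deploy.sh: hypothetical helper; assumes the heroku remote and CLI auth already exist
set -e                        # stop at the first failing command
git push heroku master        # build and release on Heroku
heroku run rails db:migrate   # run migrations only if the push succeeded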

Yeah, it really sucks

What if I told you you could fix that bit immediately with a CI/CD pipeline?

A what now? What is that?

CI/CD stands for continuous integration (CI) and continuous delivery/deployment (CD). It was fairly tough for me to understand exactly what it was when I was starting out because everyone used vague terms like “amalgamation of development and operations,” but put simply:

  • Continuous Integration: Making sure all your code is merged together in one place. Get your team to use Git and you’ll be using CI.
  • Continuous Delivery: Making sure your code is continuously ready to be shipped, meaning you can produce a ready-to-distribute version of your product quickly.
  • Continuous Deployment: Seamlessly taking the product from continuous delivery and just deploying it to your servers.

Oh, I get it now. It’s about making my app magically deploy to the world!

My favorite article explaining CI/CD is by Atlassian here. This should clear up any questions you have. Anyway, back to the problem.

Yeah, back to that. How do I avoid manual deploys?

Setting Up a CI/CD Pipeline to Deploy on Push to master

What if I told you that you could fix that bit immediately with a CI/CD pipeline? You can push to your GitLab remote (origin), and a machine will be spawned to push that code of yours straight to Heroku.

No way!

Yeah way! Let’s jump back into code again.

Diagram of a simple deploy CI/CD pipeline

Create a .gitlab-ci.yml with the following contents, swapping out toptal-pipeline for your Heroku app’s name:

image: ruby:2.4

before_script:
  - >
   : "${HEROKU_EMAIL:?Please set HEROKU_EMAIL in your CI/CD config vars}"
  - >
   : "${HEROKU_AUTH_TOKEN:?Please set HEROKU_AUTH_TOKEN in your CI/CD config vars}"
  - curl https://cli-assets.heroku.com/install-standalone.sh | sh
  - |
    cat >~/.netrc <<EOF
    machine api.heroku.com
      login $HEROKU_EMAIL
      password $HEROKU_AUTH_TOKEN
    machine git.heroku.com
      login $HEROKU_EMAIL
      password $HEROKU_AUTH_TOKEN
    EOF
  - chmod 600 ~/.netrc
  - git config --global user.email "ci@example.com"
  - git config --global user.name "CI/CD"

variables:
  APPNAME_PRODUCTION: toptal-pipeline

deploy_to_production:
  stage: deploy
  environment:
    name: production
    url: https://$APPNAME_PRODUCTION.herokuapp.com/
  script:
    - git remote add heroku https://git.heroku.com/$APPNAME_PRODUCTION.git
    - git push heroku master
    - heroku pg:backups:capture --app $APPNAME_PRODUCTION
    - heroku run rails db:migrate --app $APPNAME_PRODUCTION
  only:
    - master

Push this up, and watch it fail in your project’s Pipelines page. That’s because it’s missing the authentication keys for your Heroku account. Fixing that is fairly straightforward, though. First you’ll need your Heroku API key. Get it from the Manage Account page, and then add the following secret variables in your GitLab repo’s CI/CD settings:

  • HEROKU_EMAIL: The email address you use to sign into Heroku
  • HEROKU_AUTH_TOKEN: The API key you got from Heroku (the name must match the variable used in .gitlab-ci.yml)
Image of the secret variables in the GitLab CI/CD settings page

This should result in a working GitLab-to-Heroku deployment on every push. Here’s what’s happening:

  • Upon pushing to master
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your database is captured in Heroku.
    • Migrations are run.

Already, you can see that not only are you saving time by automating everything to a git push, you’re also creating a backup of your database on every deploy! If anything ever goes wrong, you’ll have a copy of your database to revert back to.

Creating a Staging Environment

But wait, quick question, what happens to your production-specific problems? What if you run into a weird bug because your development environment is too different from production? I once ran into some odd SQLite 3 and PostgreSQL issues when I ran a migration. The specifics elude me, but it’s quite possible.

I strictly use PostgreSQL in development, I never mismatch database engines like that, and I diligently monitor my stack for potential incompatibilities.

Well, that’s tedious work and I applaud your discipline. Personally, I’m much too lazy to do that. However, can you guarantee that level of diligence for all potential future developers, collaborators, or contributors?

Errrr— Yeah, no. You got me there. Other people will mess it up. What’s your point, though?

My point is, you need a staging environment. It’s like production but isn’t. A staging environment is where you rehearse deploying to production and catch all your errors early. My staging environments usually mirror production, and I dump a copy of the production database on staging deploy to ensure no pesky corner cases mess up my migrations. With a staging environment, you can stop treating your users like guinea pigs.

This makes sense! So how do I do this?

Here’s where it gets interesting. I like to deploy master directly to staging.

Wait, isn’t that where we’re deploying production right now?

Yes it is, but now we’ll be deploying to staging instead.

But if master deploys to staging, how do we deploy to production?

By using something you should’ve been doing years ago: Versioning our code and pushing Git tags.

Git tags? Who uses Git tags?! This is beginning to sound like a lot of work.

It sure was, but thankfully, I’ve done all that work already and you can just dump my code and it’ll work.

Overview of how staging and production deploys will work

First, add a block for the staging deploy to your .gitlab-ci.yml file. I’ve created a new Heroku app called toptal-pipeline-staging:



variables:
  APPNAME_PRODUCTION: toptal-pipeline
  APPNAME_STAGING: toptal-pipeline-staging


deploy_to_staging:
  stage: deploy
  environment:
    name: staging
    url: https://$APPNAME_STAGING.herokuapp.com/
  script:
    - git remote add heroku https://git.heroku.com/$APPNAME_STAGING.git
    - git push heroku master
    - heroku pg:backups:capture --app $APPNAME_PRODUCTION
    - heroku pg:backups:restore `heroku pg:backups:url --app $APPNAME_PRODUCTION` --app $APPNAME_STAGING --confirm $APPNAME_STAGING
    - heroku run rails db:migrate --app $APPNAME_STAGING
  only:
    - master
    - tags

...

Then change the last line of your production block to run on semantically versioned Git tags instead of the master branch:

deploy_to_production:
...
  only:
    - /^v(?'MAJOR'(?:0|(?:[1-9]\d*)))\.(?'MINOR'(?:0|(?:[1-9]\d*)))\.(?'PATCH'(?:0|(?:[1-9]\d*)))(?:-(?'prerelease'[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*))?(?:\+(?'build'[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*))?$/
    # semver pattern above is adapted from https://github.com/semver/semver.org/issues/59#issuecomment-57884619

Running this right now will fail because GitLab is smart enough to only allow “protected” branches access to our secret variables. To add version tags, go to your GitLab project’s repository settings page and add v* to protected tags.

Image of the version tag being added to protected tags in the repository settings page

Let’s recap what’s happening now:

  • Upon pushing to master, or pushing a tagged commit
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your production database is captured in Heroku.
    • The backup is dumped in your staging environment.
    • Migrations are run on the staging database.
  • Upon pushing a commit tagged with a semantic version
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your production database is captured in Heroku.
    • Migrations are run on the production database.

Do you feel powerful now? I feel powerful. I remember, the first time I came this far, I called my wife and explained this entire pipeline in excruciating detail. And she’s not even technical. I was super impressed with myself, and you should be too! Great job coming this far!

Testing Every Push

But there’s more. Since a computer is doing stuff for you anyway, it could also run all the things you’re too lazy to do: tests, linting, pretty much anything you want. If any of these fail, the pipeline won’t move on to deployment.

I love having this in my pipeline; it makes my code reviews fun. If a merge request gets through all my code checks, it deserves to be reviewed.

Image of Testing on every push

Add a test block:

test:
  stage: test
  variables:
    POSTGRES_USER: test
    POSTGRES_PASSWORD: test-password
    POSTGRES_DB: test
    DATABASE_URL: postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres/${POSTGRES_DB}
    RAILS_ENV: test
  services:
    - postgres:alpine
  before_script:
    - curl -sL https://deb.nodesource.com/setup_8.x | bash
    - apt-get update -qq && apt-get install -yqq nodejs libpq-dev
    - curl -o- -L https://yarnpkg.com/install.sh | bash
    - source ~/.bashrc
    - yarn
    - gem install bundler  --no-ri --no-rdoc
    - bundle install -j $(nproc) --path vendor
    - bundle exec rake db:setup RAILS_ENV=test
  script:
    - bundle exec rake spec
    - bundle exec rubocop

Let’s recap what’s happening now:

  • Upon every push, or merge request
    • Ruby and Node are set up in a container.
    • Dependencies are installed.
    • The app is tested.
  • Upon pushing to master, or pushing a tagged commit, and only if all tests pass
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your production database is captured in Heroku.
    • The backup is dumped in your staging environment.
    • Migrations are run on the staging database.
  • Upon pushing a commit tagged with a semantic version, and only if all tests pass
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your production database is captured in Heroku.
    • Migrations are run on the production database.

Take a step back and marvel at the level of automation you’ve accomplished. From now on, all you have to do is write code and push. Test out your app manually in staging if you feel like it, and when you feel confident enough to push it out to the world, tag it with semantic versioning!
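
Without any extra tooling, tagging a release manually is just a couple of Git commands (the version number below is only an example):

git tag v1.0.1           # tag the current commit; the version number is just an example
git push origin v1.0.1   # push the tag so the pipeline deploys it to production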

Automatic Semantic Versioning

Yeah, it’s perfect, but there’s something missing. I don’t like looking up the last version of the app and explicitly tagging it. That takes multiple commands and distracts me for a few seconds.

Okay, dude, stop! That’s enough. You’re just over-engineering it now. It works, it’s brilliant, don’t ruin a good thing by going over the top.

Okay, I have a good reason for doing what I’m about to do.

Pray, enlighten me.

I used to be like you. I was happy with this setup, but then I messed up. git tag lists tags in alphabetical order, so v0.0.11 sits above v0.0.2. I once accidentally tagged a release with the wrong version and kept doing it for about half a dozen releases until I saw my mistake. That’s when I decided to automate this too.
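
You can see the problem for yourself by comparing plain git tag output with a version-aware sort, for example:

$ git tag                    # plain listing is lexical: v0.0.11 sits above v0.0.2
v0.0.1
v0.0.11
v0.0.2
$ git tag --sort=v:refname   # version-aware sort shows the order you actually expect
v0.0.1
v0.0.2
v0.0.11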

Here we go again

Okay, thankfully, we have the power of npm at our disposal, and I found a suitable package. Run yarn add --dev standard-version and add the following to your package.json file:

  "scripts": {
    "release": "standard-version",
    "major": "yarn release --release-as major",
    "minor": "yarn release --release-as minor",
    "patch": "yarn release --release-as patch"
  },

Now you need to do one last thing: configure Git to push tags by default. At the moment, you need to run git push --tags to push a tag up, but automatically doing that on a regular git push is as simple as running git config --global push.followTags true.

To use your new pipeline, whenever you want to create a release, run:

  • yarn patch for patch releases
  • yarn minor for minor releases
  • yarn major for major releases

If you’re unsure about what the words “major,” “minor,” and “patch” mean, read more about this at the semantic versioning site.

Now that you’ve finally completed your pipeline, let’s recap how to use it!

  • Write code.
  • Commit and push it to test and deploy it to staging.
  • Use yarn patch to tag a patch release.
  • git push to push it out to production.

Summary and Further Steps

I’ve only just scratched the surface of what’s possible with CI/CD pipelines. This is a fairly simplistic example. You can do so much more by swapping out Heroku with Kubernetes. If you decide to use GitLab CI, read the YAML docs, because there’s so much more you can do by caching files between deploys or saving artifacts!

Another huge change you could make to this pipeline is to introduce external triggers to run the semantic versioning and releasing. Currently, ChatOps is part of GitLab’s paid plan, and I hope they release it to free plans. But imagine being able to trigger the pipeline in the next diagram through a single Slack command!

Diagram of a CI/CD deployment pipeline where production deploys are triggered externally, possibly via chat or webhooks

Eventually, as your application grows more complex and requires system-level dependencies, you may need to use a container. When that happens, check out our guide: Getting Started with Docker: Simplifying Devops.

This example app really is live, and you can find the source code for it here.

This post was written by Jonathan Bethune, Python developer for Toptal.

Those of us who are old enough can remember a day when software was delivered primarily on physical media. The spread of broadband internet and smartphones has led us to the age of the web service: software hosted in the cloud, accessed by user clients such as browsers and apps.

Not too long ago, web applications were run directly on physical machines in private data centers. For ease of management, these applications were usually monolithic—a single large server would contain all of the back-end code and database. Now, web hosting services like Amazon and the spread of hypervisor technology have changed all of that. Thanks to Amazon Web Services (AWS) and tools like VirtualBox, it has become easy to package an entire OS in a single file.

Using services like EC2, it has become easy to package machine images and string together sets of virtual servers. Along came the microservices paradigm—an approach to software architecture wherein large monolithic apps are broken up into smaller focused services that do one thing well. In general, this approach allows for easier scaling and feature development as bottlenecks are quicker to find and system changes easier to isolate.

Pets to Livestock

I became an infrastructure engineer right at the height of this trend. I recall building my first production environment in Amazon using a series of bash scripts. The servers were like pets to me. I gave each of them cute names. I monitored them carefully. I responded to alerts quickly and kept them healthy. I treated those instances with love and affection because it was painful to try to replace them—much like a beloved pet.

Along came Chef, a configuration management tool, and almost immediately my life got easier. With tools like Chef and Puppet, you can take away most of the manual pain associated with managing a cloud system. You can use its “environments” construct to separate development, staging, and production servers. You can use its “data bags” and “roles” to define configuration parameters and push sets of changes. Now, all of my “pet” servers had graduated from obedience school.

A graphic representation of a crane managing shipping containers

Then in 2013, along came Docker, and a new era began: the age of software as livestock (apologies to any vegans in the audience). The container paradigm is one of orchestration, not configuration management. Tools like Kubernetes, Docker Compose, and Marathon focus on moving around predefined images instead of adjusting config values on running instances. Infrastructure is immutable; when a container goes bad, we don’t try to fix it—we shoot it in the head and replace it. We care more about the health of the herd than individual animals. We don’t give our servers cute names anymore.

The Rewards

Containers make a lot of things easier. They let businesses focus more on their own special sauce. Tech teams can worry less about infrastructure and configuration management and instead worry mostly about app code. Companies can go a step further and use managed services for things like MySQL, Cassandra, Kafka, or Redis so as not to have to deal with the data layer at all. There are several startups offering “plug and play” machine learning services as well to allow companies to do sophisticated analytics without worrying about the infrastructure. These trends have culminated in the serverless model—a software architecture approach that allows teams to release software without managing a single VM or container. AWS services like S3, Lambda, Kinesis, and Dynamo make this possible. So to extend the analogy, we have gone from pets to livestock to some sort of on-demand animal service.

All of this is very cool. It is crazy that we live in a time where a twelve-year-old kid can spin up a sophisticated software system with a few clicks. We should remember that, not very long ago, this was impossible. Just a few US presidents ago, physical media was the standard and only big companies had the means to manufacture and distribute software. Bug fixes were a luxury. Now, that twelve-year-old kid can create an AWS account and make his software available to the entire world. If there’s a bug, someone will bug him on Slack and, in a few minutes, a fix is out for all users.

The Risks

Very, very cool, but not without its price—reliance on cloud providers like Amazon means reliance on big corporations and proprietary technologies. If Richard Stallman and Edward Snowden haven’t made you worry about such things, the recent debacle with Facebook certainly should have.

Greater abstraction away from hardware also brings with it the risk of less transparency and control. When something breaks in a system running hundreds of containers, we have to hope that the failure bubbles up somewhere we can detect. If the problem is with the host operating system or underlying hardware, it might be hard to determine. An outage that could have been resolved in 20 minutes using VMs may take hours or days to resolve with containers if you do not have the right instrumentation.

It isn’t just failures we need to worry about when it comes to things like Docker, either. There is also the problem of security. Whatever container platform we use, we have to trust that there are no backdoors or undisclosed security vulnerabilities. Using open-source platforms is no guarantee of safety. If we rely on third-party container images for parts of our system, we may be vulnerable.

Wrap Up

The livestock paradigm is attractive for a number of reasons, but it is not without its downsides. Before rushing to containerize the entire stack, tech teams need to think about whether or not it is the right choice and ensure they can mitigate the negative effects.

Personally, I love working with containers. I’m excited to see where things go in the next ten years as new platforms and paradigms arise. However, as a former security consultant, I am cautious enough to know that everything comes with a price. It is up to engineers to remain vigilant to ensure that we don’t give up our autonomy as users and developers. Even the easiest CI/CD workflow in the world would not be worth that cost.

This post was written by Phillip Brennan, Software Developer for Toptal.

Regardless of the apparent evidence to the contrary, programmers are humans. And, like all humans, we like taking advantage of our freedom of choice. Whether that choice is about taking the red pill or the blue pill, wearing a dress or pants, or using one development environment over another, the choice we make places us in one group of people or another. Choice, inevitably, comes after our evaluation of options. And having made a choice, we tend to believe that anyone who chooses differently made a mistake.

You can easily search the internet and find hundreds of debates about Emacs vs Vim. Even if you read them all, it will be impossible to objectively choose a winner. However, does the choice of development environment tell you anything about the quality of work a developer can deliver? Absolutely not!

A great developer could write her code in Notepad and still deliver great stuff.

Certainly, there are a lot of things professionals consider when selecting tools for their work. This is true for every profession, including software development. Quite often, however, selection is based on personal taste, not something easily tangible.

Programmers spend most of their time looking at the development environment, so it is natural that we want something pretty as well as functional. Every development environment has its pros and cons. As a whole, they are a driving force of the software development industry.

best programming editors

What are the things a developer should evaluate when choosing a set of programming tools, such as a programming editor? The answer to this question is not as simple as it might sound. Software development is close to an art, and there are quite a few “fuzzy” factors that separate a masterpiece from an overpriced collectable.

Every programming language, be it Java, C#, PHP, Python, Ruby, JavaScript, and so on, has its own development practices related to project structure, debugging, and deploying. However, one thing they all have in common is editing code. In this article we will evaluate different development platforms from the perspective of the most common task in software development: writing code.

IDE vs General Purpose Text Editor

An integrated development environment (IDE) (or interactive development environment) is a software application that provides comprehensive facilities to computer programmers for software development. An IDE normally consists of a source code editor, build automation tools, and a debugger, and many support lots of additional plugins and extensions.

Text editors are simpler applications. Compared to IDEs, they usually correspond to just the code editor segment of an IDE. However, they are often much more than that. IDEs are created to serve the purpose of software development, while many text editors are designed to be used by non-developers as well.

Statically typed languages can get a lot of benefit from IDEs. Because of the strict typing rules, it is possible for the IDE to detect bugs and naming inconsistencies across classes and modules, and even across files, directly in the editor, before compiling. This functionality comes standard with many IDEs, and for that reason, IDEs are very popular for statically typed languages.

However, it is impossible to do the same thing for dynamically typed languages. For example, if a method name may be generated by the code itself, constructed from a series of string concatenations, trying to detect naming errors in dynamic languages requires nothing less than running the actual program. Because one of the major benefits of IDEs does not apply to dynamic-language programmers, they have a greater tendency to stick with text editors like Sublime. As a side note, this is also a major reason why the test-driven development movement has grown up around dynamic-language communities and has not had as strong a following among users of static languages.

What Makes a Great Programming Editor?

Aside from a number of different features for various languages, every programming editor needs to have a well-organized and clean user interface. Overall aesthetic appeal should not be overlooked, either. It is not just a matter of looking good, as a well-designed editor with the right choice of font and colors helps keep eyestrain down and lets you be more productive.

In today’s development environment, a steep learning curve is a liability, regardless of feature set. Time is always valuable, so a good editor should be easy to get used to. Ideally, the programmer should be able to start work immediately, without having to jump through too many hoops. A Swiss army knife is a practical and useful tool, yet anyone can master it in minutes. Likewise, for programming editors, simplicity is a virtue.

User Interface, Features, and Workflow

Let’s take a closer look at UI, different features and capabilities, and frequently used tools that should be a part of any programming editor.

Line numbers, of course, should be on by default and simple to turn on or off.

Snippets are useful for inserting standardized blocks of text in a fixed layout. However, programming is a lot about saying things just once, so be careful with snippets as they might make your code hard to maintain in the future.

The ability to lint, or syntax-check, the current file is useful, as is the ability to launch it. Without this facility, a programmer must switch to an external command line window, choose and run the correct command, and then step through error messages to find the source of the error. However, the linting must be under the programmer’s control, because the delay incurred by the lint might interrupt the coder at a crucial moment.

inline doc

Inline doc is useful as long as it does not get in the way, but having a browser page open on the class definitions is sometimes more useful, especially when there are lots of related classes that do not directly extend each other. It is easy enough to cut and paste code from the browser documentation to the code being written, so the additional complexity of inline documentation often becomes less useful, indeed, more annoying, as the programmer’s knowledge of the documentation increases.

Word-completion is helpful since it is fast, and almost as reliable as in-edit documentation, while being less intrusive. It is satisfying to enter just a few characters of a word and then hit enter to get the rest. Otherwise, one labors under the strain of excess typing, abhorred by lazy programmers, who want to type ee rather than the more lengthy exponentialFunctionSquared. Word completion satisfies by minimizing typing, enforcing coherent naming and by not getting in the way.

Renaming variables and functions across the program is useful, but you need to be able to review changes and make sure your code is not broken. Again, word completion is a useful halfway house, in that it works for all languages; you can use long names for items that have long lifetimes, without incurring a typing overhead. You can use references to them via a shorter name locally, in order to shorten expressions which might otherwise spread over too many lines. If you need to rename, the long names are unique, so this approach works across all languages, and all files.

Source files can sometimes grow a lot. Code-folding is a nice feature that simplifies reading through long files.

Find/change with scope limitation to local, incremental, or global with meta characters and regular expressions are part of the minimum requirement these days, as is syntax highlighting.

Over the years, I went through a number of editors, and this is what I think of them:

  • Emacs: One of the most popular editors in the world. Emacs’ greatest feature is its extensibility, despite the complexity of its extension language (you can even play Tetris in it with M-x tetris). Emacs fans consider its terminal-based interface to be a great feature, while others might debate that it’s a drawback. In my personal experience, I found it too much to adopt and learn. I am sure that if you know how to use Emacs you will never use anything else, but to take on and learn the entire culture was more than I wanted to do. Nevertheless, its popularity among developers proves that it is far from being a relic of the old times, and remains part of our future as well.
  • Vi/Vim: Vim is another powerful terminal-based editor, and it comes standard with most xNIX operating systems. Apart from having a different interface than Emacs, my view is practically the same. If you grew up on it, I am sure you will never use anything else. Having Vi skills will make your life much simpler when operating through SSH and other tight spots, and you won’t have problems with speed once you get familiar with the keystrokes. While not as tough to crack into as Emacs, the learning curve is still quite steep, and it could definitely use a few nice features of a windowed editor.
  • SublimeText: True to its name, SublimeText is a beautiful text editor with tons of features. Unlike some similar editors, SublimeText is closed source, so it cannot be modified at a low level. SublimeText offers the simplicity of traditional text editors, with a lean and fast UI. Many developers find it easier to use than Vim, and this is especially true of newcomers. The learning curve just isn’t as steep. While the UI is minimal and straightforward, SublimeText does offer a few nifty features, such as a scaled-down display of the code on the right of the UI, allowing users to quickly scroll through their code and navigate with relative ease. While it’s not completely free, the feature-limited demo version is. Unlocking all the features will cost you $70.
  • Atom: Atom is the result of a GitHub effort to produce a programming editor for a new generation of developers. While it is still a work in progress, Atom is a very capable editor with a vibrant community of developers keen on new extensions, JavaScript libraries, and more. Its downsides include some UI quirks, the possibility that some add-on packages could misbehave, and reported performance issues when working with (very) big files. But the project is under active development, and current shortcomings are likely to be improved. Atom is an open source project, and it can easily be hacked to suit your needs.
  • Nano: Excellent in a tight corner, but not feature-rich enough to prevent the inevitable thought creeping into one’s mind that there must be a faster way to do this as one struggles through the keystrokes to indent a block of code while keeping the comments lined up in column 80! It does not even have text highlighting, and should not be used for anything more than config file changes.
  • TextMate2: TextMate’s biggest drawback is that it only runs on Mac. As its creators put it, “TextMate brings Apple’s approach to operating systems into the world of text editors.” By bridging UNIX underpinnings and GUI, TextMate cherry-picks the best of both worlds, to the benefit of expert scripters and novice users alike. It is the editor of choice for many Ruby, Python, and JavaScript developers, with great support for Bash or Markdown as well. At the time of publishing this article, TextMate 2 is still in beta, but it already has a very mature plugin ecosystem that promises to extend it even beyond Emacs’s extensions.
  • jEdit: Java-based, and considered slow by some. Out of the box configuration might push certain people away, but jEdit can be extremely fast if configured properly, as well as extremely nice looking.
  • Eclipse: Another widely used IDE, Eclipse is very popular among Java developers, but has been adapted to many different platforms. We could argue that its monolithic architecture is a rock that will pull it under the water, but it is still one of the most popular platforms among developers.
  • Aptana Studio: A comprehensive open-source web application IDE. It is available as an Eclipse plugin, which makes it popular among some Java developers. The standalone version is even leaner, and offers a range of different themes and customization options. Aptana’s project management capabilities may also come in handy to coders who honed their skills in Eclipse. While earlier versions suffered from performance issues on some hardware platforms, these problems were addressed in Aptana Studio 3, and should be a thing of the past.
  • NetBeans: Another relatively popular open-source IDE with cross-platform support. It is somewhat slower on startup than lean editors like SublimeText, and the choice of add-ons is limited compared to some alternatives. Many Java developers have grown to love NetBeans thanks to seamless SCM integration and HTML5 support. NetBeans support for PHP has also improved in the latest releases.
  • JetBrains: Offers a family of IDEs for Java, Ruby, Python, and PHP, all based on the same core engine. Very capable in their own right, JetBrains IDEs have been gaining a growing following. However, they are not free, open-source solutions, although a 30-day trial is available and pricing is reasonable.
  • Komodo Edit: Komodo Edit has great potential, and yet it’s full of annoying little “gotchas” and idiosyncrasies that can be frustrating, given its lack of orthogonality. Komodo Edit feels cluttered, which is a shame because it clearly has immense capability. I keep going back to Komodo Edit in the hopes that I have missed some organizing principle, and every time, I am beaten back by a welter of disorganized capability.
  • Geany: Geany is not a major power player like many of the other editors in this list. It is defined more by “what it is not” than “what it is.” It is not slow, it does not have a lot of heritage from the old days, it does not have a macro capability, or much of a multi-window-on-buffer capability. Yet the things it does do, it does well enough. It is, perhaps, the least demanding of all the editors that I tried and still can do 90 percent of what you would expect from a programmer’s editor. Geany looks good enough on Ubuntu, which is one of the reasons I chose it as my preferred editor.

My Conclusion

It would be presumptuous to declare just one as the best programming editor among these great tools. And there are quite a few editors I did not even try. There is no one-size-fits-all solution. This is what compelled me to try out a number of different editors.

I am currently using Geany, but that’s because it fits the requirements I have. With Geany, and a lot of help from Perl/Gimp/Audacity/Sox, I am able to develop and maintain the Java code-base for the Android apps I develop, prepare them for compilation in different configurations for multiple distributors, source, lint, compile, dex, and produce .apk files, and deliver these apps globally.

Your line of development might set a different set of requirements, and I hope I saved you some time in researching the most appropriate programming editor.

This post was written by Ben Jones, Front-end Developer for Toptal.

Ben is a skilled developer who always keeps the user in mind, which allows him to see how beneficial or detrimental a development process can be. He wants to create simple but effective software to reduce workload on all sides and to make employees, employers, and customers happy.


JavaScript frameworks/libraries such as Vue can offer a fantastic user experience when browsing your site. Most offer a way of dynamically changing page content without having to send a request to the server each time.

However, there is an issue with this approach. When initially loading your website, your browser doesn’t receive a complete page to display. Instead, it gets sent a bunch of pieces to construct the page (HTML, CSS, other files) and instructions for how to put them all together (a JavaScript framework/library). It takes a measurable amount of time to put all this information together before your browser actually has something to display. It’s like being sent a bunch of books along with a flat-pack bookcase. You’d have to build the bookcase first and then fill it with the books.

The solution to this is clever: Have a version of the framework/library on the server that can build a ready-to-display page. Then send this complete page to the browser along with the ability to make further changes and still have dynamic page content (the framework/library), just like being sent a ready-made bookcase along with some books. Sure, you still have to put the books in the bookcase, but you’ve got something usable immediately.

Visual comparison of client-side and server-side rendering

Beyond the silly analogy, there are also a bunch of other advantages. For example, a page that rarely changes, such as an About Us page, doesn’t need to be recreated every single time a user asks for it. So a server can create it once and then cache it or store it somewhere for future use. These kinds of speed improvements may seem tiny, but in an environment where time until responsiveness is measured in milliseconds (or less), every little bit counts.

If you’d like more information on the advantages of SSR in a Vue environment, you should check out Vue’s own article on SSR. There is a variety of options to achieve these results, but the most popular one, which is also recommended by the Vue team, is Nuxt.

Why Nuxt.js

Nuxt.js is based on an implementation of SSR for the popular React library called Next. After seeing the advantages of this design, the Vue community created a similar implementation called Nuxt. Those familiar with the React+Next combination will spot a bunch of similarities in the design and layout of the application. However, Nuxt offers Vue-specific features to create a powerful yet flexible SSR solution for Vue.

Nuxt was updated to a production-ready 1.0 version in January 2018 and is part of an active and well-supported community. One of the great things is that building a project using Nuxt isn’t that different from building any other Vue project. In fact, it provides a bunch of features that allow you to create well-structured codebases in a reduced amount of time.

Another important thing to note is Nuxt doesn’t have to be used for SSR. It’s promoted as a framework for creating universal Vue.js applications and includes a command (nuxt generate) for creating static generated Vue applications using the same codebase. So if you’re apprehensive about diving deep into SSR, don’t panic. You can always create a static site instead while still taking advantage of Nuxt’s features.

In order to grasp the potential of Nuxt, let’s create a simple project. The final source code for this project is hosted on GitHub if you want to see it, or you can view a live version created using nuxt generate and hosted on Netlify.

Creating a Nuxt Project

To start off, let’s use a Vue project generator called vue-cli to quickly create a sample project:

# install vue-cli globally
npm install -g vue-cli

# create a project using a nuxt template
vue init nuxt-community/starter-template my-nuxt-project

After going through a couple options, this will create a project inside the folder my-nuxt-project or whatever you specified. Then we just need to install dependencies and run the server:

cd my-nuxt-project
npm install # Or yarn
npm run dev

There we go. Open your browser to localhost:3000 and your project should be running. Not much different from creating a Vue Webpack project. However, when we look at the actual structure of the app, there’s not much there, especially when compared to something like the Vue Webpack template.

Diagram of project directories and their relation to the Nuxt config file

Looking in the package.json also shows that we only have one dependency, Nuxt itself. This is because each version of Nuxt is tailored to work with specific versions of Vue, Vue-router, and Vuex and bundles them all together for you.

There is also a nuxt.config.js file at the project root. This allows you to customize a bunch of features that Nuxt provides. By default, it sets the header tags, loading bar color, and ESLint rules for you. If you’re eager to see what you can configure, here’s the documentation; we will be covering some options in this article.
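
As a rough illustration, a nuxt.config.js tweak to the default head tags and loading bar color might look like this (the values are just placeholders):

module.exports = {
  // Default <head> tags applied to every page (values here are placeholders)
  head: {
    title: 'my-nuxt-project',
    meta: [
      { charset: 'utf-8' },
      { name: 'viewport', content: 'width=device-width, initial-scale=1' }
    ]
  },
  // Color of the progress bar shown while navigating between routes
  loading: { color: '#3B8070' }
}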

So what’s so special about those directories?

Project Layout

If you browse through the directories created, all of them have an accompanying Readme stating a brief summary of what goes in that directory and often a link to the docs.

This is one benefit of using Nuxt: a default structure for your application. Any good front-end developer will structure an application similar to this, but there are many different ideas about structures, and when working on a team, some time will inevitably go into discussing or choosing this structure. Nuxt provides one for you.

Nuxt will look for certain directories and build your application for you depending on what it finds. Let’s examine these directories one by one.

Pages

This is the only required directory. Any Vue components in this directory are automatically added to vue-router based on their filenames and the directory structure. This is extremely convenient. Normally I would have a separate Pages directory anyway and have to manually register each of those components in another router file. This router file can become complex for larger projects and may need splitting to maintain readability. Instead, Nuxt will handle all of this logic for you.

To demonstrate, we can create a Vue component called about.vue inside the Pages directory. Let’s just add a simple template such as:

<template>
 <h1>About Page</h1>
</template>

When you save, Nuxt will re-generate the routes for you. Seeing as we called our component about.vue, if you navigate to /about, you should see that component. Simple.

There is one filename which is special. Naming a file index.vue will create a root route for that directory. When the project is generated, there’s already an index.vue component in the pages directory which correlates to the homepage or landing page of your site. (In the development example, this would simply be localhost:3000.)

Nuxt scans the Vue files in the pages directory and outputs the appropriate pages.

What about deeper routes? Sub-directories in the Pages directory help to structure your routes. So if we wanted a View Product page, we could structure our Pages directory something like this:

/pages
--| /products
----| index.vue
----| view.vue

Now, if we navigate to /products/view, we will see the view.vue component inside the products directory. If we navigate instead to /products, we will see the index.vue component inside the products directory.

You may be asking why we didn’t just create a products.vue component in the pages directory instead like we did for the /about page. You may think the result would be the same, but there is a difference between the two structures. Let’s demonstrate this by adding another new page.

Say we wanted a separate About page for each employee. For example, let’s create an About page for me. It should be located at /about/ben-jones. Initially, we may try structuring the Pages directory like this:

/pages
--| about.vue
--| /about
----| ben-jones.vue

When we try to access /about/ben-jones, we instead get the about.vue component, the same as /about. What’s going on here?

Interestingly, what Nuxt is doing here is generating a nested route. This structure suggests that you want a permanent /about route and anything inside that route should be nested in its own view area. In vue-router, this would be signified by specifying a <router-view /> component inside the about.vue component. In Nuxt, this is the same concept except, instead of <router-view />, we simply use <nuxt />. So let’s update our about.vue component to allow for nested routes:

<template>
 <div>
   <h1>About Page</h1>
   <nuxt />
 </div>
</template>

Now, when we navigate to /about, we get the about.vue component we had before, with just a title. However, when we navigate to /about/ben-jones, we instead have the title and the ben-jones.vue component rendered where the <nuxt/> placeholder was.

This wasn’t what we initially wanted, but the idea of having an About page with a list of people that, when clicked on, fill a section on the page with their information is an interesting concept, so let’s leave it as is for now. If you did want the other option, then all we would do is restructure our directories. We’d just have to move the about.vue component inside the /about directory and rename it index.vue, so the resulting structure would be:

/pages
--| /about
----| index.vue
----| ben-jones.vue

Finally, say we wanted to use route params to retrieve a specific product. For example, we want to be able to edit a product by navigating to /products/edit/64 where 64 is the product_id. We can do this the following way:

/pages
--| /products
----| /edit
------| _product_id.vue

Note the underscore at the beginning of the _product_id.vue component—this signifies a route param which is then accessible on the $route.params object or on the params object in Nuxt’s Context (more on that later). Note that the key for the param will be the component name without the initial underscore—in this case, product_id—so try to keep them unique within the project. As a result, in _product_id.vue, we may have something like:

<template>
 <h1>Editing Product {{ $route.params.product_id }}</h1>
</template>

You can start to imagine more complex layouts, which would be a pain to set up using vue-router. For example, we can combine all of the above into a route such as:

/pages
--| /categories
----| /_category_id
------| products.vue
------| /products
--------| _product_id.vue

It’s not too difficult to reason about what /categories/2/products/3 would display. We would have the products.vue component with a nested _product_id.vue component, with two route params: category_id and product_id. This is much simpler to reason about than an equivalent router config.
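
For comparison, a hand-written vue-router configuration for that same directory structure might look roughly like the sketch below (an approximation with made-up component names, not what Nuxt literally generates):

// Rough vue-router equivalent of the /categories structure above (approximation only)
const routes = [
  {
    path: '/categories/:category_id/products',
    component: ProductsPage,            // pages/categories/_category_id/products.vue
    children: [
      {
        path: ':product_id',            // rendered inside the parent's <nuxt /> outlet
        component: ProductDetailPage    // pages/categories/_category_id/products/_product_id.vue
      }
    ]
  }
]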

While we’re on the topic, one thing I tend to do in the router config is set up router guards. As Nuxt is building the router for us, this can be done instead on the component itself with beforeRouteEnter. If you want to validate route params, Nuxt provides a component method called validate. So if you wanted to check if the product_id was a number before trying to render the component, you would add the following to the script tag of _product_id.vue:

export default {
 validate ({ params }) {
   // Must be a number
   return /^\d+$/.test(params.product_id)
 }
}

Now, navigating to /categories/2/products/someproduct results in a 404 because someproduct isn’t a valid number.

That’s it for the Pages directory. Learning how to structure your routes properly in this directory is essential, so spending a little time initially is important to getting the most out of Nuxt. If you’re looking for a brief overview, it is always helpful to refer to the docs for routing.

If you’re worried about not being in control of the router, don’t be. This default setup works great for a wide variety of projects, provided they are well structured. However, there are some cases where you may need to add more routes to the router than Nuxt automatically generates for you or restructure them. Nuxt provides a way to customize the router instance in the config, allowing you to add new routes and customize generated routes. You can also edit the core functionality of the router instance, including extra options added by Nuxt. So if you do encounter an edge case, you still have the flexibility to find the appropriate solution.
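
As a sketch of what that looks like, the router section of nuxt.config.js exposes an extendRoutes hook; the route added here is purely an example:

// nuxt.config.js: sketch of adding a custom route on top of the generated ones
module.exports = {
  router: {
    extendRoutes (routes, resolve) {
      routes.push({
        name: 'custom-product',                                    // example name
        path: '/p/:product_id',                                    // extra alias, purely illustrative
        component: resolve(__dirname, 'pages/products/view.vue')   // reuse an existing page component
      })
    }
  }
}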

Store

Nuxt can build your Vuex store based on the structure of the store directory, similar to the Pages directory. If you don’t need a store, just remove the directory. There are two modes for the store, Classic and Modules.

Classic requires you to have an index.js file in the store directory. There you need to export a function that returns a Vuex instance:

import Vuex from 'vuex'

const createStore = () => {
 return new Vuex.Store({
   state: ...,
   mutations: ...,
   actions: ...
 })
}

export default createStore

This allows you to create the store however you wish, much like using Vuex in a normal Vue project.

Modules mode also requires you to create an index.js file in the store directory. However, this file only needs to export the root state/mutations/actions for your Vuex store. The example below specifies a blank root state:

export const state = () => ({})

Then, each file in the store directory will be added to the store in its own namespace or module. For example, let’s create somewhere to store the current product. If we create a file called product.js in the store directory, then a namespaced section of the store will be available at $store.state.product. Here’s a simple example of what that file may look like:

export const state = () => ({
 _id: 0,
 title: 'Unknown',
 price: 0
})

export const actions = {
 load ({ commit }) {
   setTimeout(
     commit,
     1000,
     'update',
     { _id: 1, title: 'Product', price: 99.99 }
   )
 }
}

export const mutations = {
 update (state, product) {
   Object.assign(state, product)
 }
}

The setTimeout in the load action simulates some sort of API call, which will update the store with the response; in this case, it takes one second. Now, let’s use it in the products/view page:

<template>
 <div>
   <h1>View Product {{ product._id }}</h1>
   <p>{{ product.title }}</p>
   <p>Price: {{ product.price }}</p>
 </div>
</template>

<script>
import { mapState } from 'vuex'
export default {
 created () {
   this.$store.dispatch('product/load')
 },
 computed: {
   ...mapState(['product'])
 }
}
</script>

A few things to note: Here, we are calling our fake API when the component is created. You can see that the product/load action we are dispatching is namespaced under product. This makes it clear exactly what section of the store we are dealing with. Then, by mapping the state to a local computed property, we can easily use it in our template.

There is a problem: We see the original state for a second while the API runs. Later, we will use a solution provided by Nuxt to fix this (known as fetch).
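
As a preview, the fix will boil down to moving this dispatch into Nuxt’s fetch hook, which runs before the page is rendered. A minimal sketch of the classic hook shape (note that the demo action above would need to return a promise for Nuxt to actually wait on it):

// Sketch of the classic fetch hook: it runs before the page is rendered, on server and client
export default {
  fetch ({ store }) {
    // Returning a promise tells Nuxt to wait for it before rendering;
    // the demo action above would need to return one for this to help
    return store.dispatch('product/load')
  }
}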

Just to stress this again, we never had to npm install vuex, as it is already included in the Nuxt package. When you add an index.js file to the store directory, all those methods are then opened up to you automatically.

Those are the two main directories explained; the rest are much simpler.

Components

The Components directory is there to contain your reusable components such as a navigation bar, image gallery, pagination, data tables, etc. Seeing as components in the Pages directory are converted into routes, you need somewhere else to store these types of components. These components are accessible in pages or other components by importing them:

import ComponentName from '~/components/ComponentName.vue'

Assets

This contains uncompiled assets and has more to do with how Webpack loads and processes files, rather than with how Nuxt works. If you’re interested, I suggest reading the guide in the Readme.

Static

This contains static files which are mapped to the root directory of your site. For example, putting an image called logo.png in this directory would make it available at /logo.png. This is good for meta files like robots.txt, favicon.ico, and other files you need available.

Layouts

Normally, in a Vue project, you have some sort of root component, usually called App.vue. Here is where you can set up your (normally static) app layout, which may include a navbar, footer, and then a content area for your vue-router. The default layout does exactly that and is provided for you in the layouts folder. Initially, all it has is a div with a <nuxt /> component (which is equivalent to <router-view />), but it can be styled however you wish. For example, I’ve added a simple navbar to the example project for navigation around the various demonstration pages.
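
Below is a sketch of what such a customized default layout might look like; the navbar links are placeholders rather than the example project’s exact markup:

<!-- layouts/default.vue -->
<template>
 <div>
   <nav>
     <nuxt-link to="/">Home</nuxt-link>
     <nuxt-link to="/products">Products</nuxt-link>
   </nav>
   <nuxt />
 </div>
</template>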

A layout can be applied to multiple pages.

You may want to have a different layout for a certain section of your app. Maybe you have some sort of CMS or admin panel that looks different. To solve this, create a new layout in the Layouts directory. As an example, let’s create an admin-layout.vue layout which just has an extra header tag and no navbar:

<template>
 <div>
   <h1>Admin Layout</h1>
   <nuxt />
 </div>
</template>

Then, we can create an admin.vue page in the Pages directory and use a property provided by Nuxt called layout to specify the name (as a string) of the layout we want to use for that component:

<template>
 <h1>Admin Page</h1>
</template>

<script>
export default {
 layout: 'admin-layout'
}
</script>

That’s all there is to it. Page components will use the default layout unless specified, but when you navigate to /admin, it now uses the admin-layout.vue layout. Of course, this layout could be shared across several admin screens if you wish. The one important thing to remember is layouts must contain a <nuxt /> element.

There’s one final thing to note about layouts. You may have noticed while experimenting that if you type an invalid URL, you are shown an error page. This error page is, in fact, another layout. Nuxt has its own error layout (source code here), but if you wanted to edit it, just create an error.vue layout and that will be used instead. The caveat here is that the error layout must not have a <nuxt /> element. You will also have access to an error object on the component with some basic information to display. (This is printed out in the terminal running Nuxt if you want to examine it.)
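
As a rough sketch (assuming you only want to special-case 404s), a custom error.vue layout could look like the following. Note the error prop and the absence of a <nuxt /> element:

<!-- layouts/error.vue -->
<template>
 <div>
   <h1 v-if="error.statusCode === 404">Page not found</h1>
   <h1 v-else>An error occurred</h1>
   <nuxt-link to="/">Back to the home page</nuxt-link>
 </div>
</template>

<script>
export default {
 props: ['error']
}
</script>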

Middleware

Middleware are functions that can be executed before rendering a page or layout. There is a variety of reasons you may want to do so. Route guarding is a popular use where you could check the Vuex store for a valid login or validate some params (instead of using the validate method on the component itself). One project I worked on recently used middleware to generate dynamic breadcrumbs based on the route and params.

These functions can be asynchronous; just be careful, as nothing will be shown to the user until the middleware is resolved. They also have access to Nuxt’s Context, which I will explain later.
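
To make this concrete, here is a minimal route-guard sketch. The user store module and the /login route are assumptions for illustration only, not part of the example project:

// middleware/auth.js
export default function ({ store, redirect }) {
 // If there is no logged-in user in the Store, send the visitor to /login.
 if (!store.state.user || !store.state.user.loggedIn) {
   return redirect('/login')
 }
}

You would then enable it on a specific page with middleware: 'auth', or for every route via the router.middleware option in nuxt.config.js.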

Plugins

This directory allows you to register Vue plugins before the application is created. This allows the plugin to be shared throughout your app on the Vue instance rather than creating it each time you import it into a component. This helps to keep the bundle sizes small.

Most major plugins have a Nuxt version that can be easily registered to the Vue instance by following their docs. However, there will be circumstances when you will be developing a plugin or need to adapt an existing plugin for this purpose. An example I’m borrowing from the docs shows how to do this for vue-notifications. First, we need to install the package:

npm install vue-notifications --save

Then create a file in the plugins directory called vue-notifications.js and include the following:

import Vue from 'vue'
import VueNotifications from 'vue-notifications'

Vue.use(VueNotifications)

This is very similar to how you would register a plugin in a normal Vue environment. Then, edit the nuxt.config.js file at your project root and add the following entry to the module.exports object:

plugins: ['~/plugins/vue-notifications']

That’s it. Now you can use vue-notifications throughout your app without increasing the bundle size. An example of this is at /plugin in the example project.

So that completes a rundown of the directory structure. It may seem like a lot to learn, but if you’re developing a Vue app, you’re already setting up the same kind of logic. Nuxt helps to abstract away the setup and lets you focus on building.

Nuxt does more than assist in development though. It supercharges your components by providing extra functionality.

Nuxt’s Supercharged Components

When I first started researching Nuxt, I kept reading about how Page components are supercharged. It sounded great, but it wasn’t immediately obvious what exactly that meant and what benefits it brings.

What it means is that all Page components have extra methods attached to them that Nuxt can use to provide additional functionality. In fact, we’ve already seen one of these earlier when we used the validate method to check params and redirect a user if they are invalid.

The two main ones used in a Nuxt project will be the asyncData and fetch methods. Both are very similar in concept: they run asynchronously before the component is generated, and they can be used to populate the data of a component and the store. They also enable the page to be fully rendered on the server before sending it to the client, even when we have to wait for some database or API call.

What’s the difference between asyncData and fetch?

  • asyncData is used to populate the data of the Page component. When you return an object, it is then merged with the output of data before rendering (see the sketch just after this list).
  • fetch is used to populate the Vuex Store. If you return a promise, Nuxt will wait until it is resolved before rendering.
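
For example, here is a minimal asyncData sketch for a hypothetical page, reusing the same fake one-second API call from earlier; the resolved object is merged into the component’s data:

export default {
 asyncData () {
   return new Promise(resolve => {
     setTimeout(() => {
       resolve({ product: { _id: 1, title: 'Product', price: 99.99 } })
     }, 1000)
   })
 }
}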

So let’s put these to good use. Remember earlier on the /products/view page we had a problem where the initial state of the store was being displayed briefly while our fake API call was being made? One way of fixing this is having a boolean stored on the component or in the Store such as loading = true and then displaying a loading component while the API call finishes. Afterward, we would set loading = false and display the data.

Instead, let’s use fetch to populate the Store before rendering. In a new page called /products/view-async, let’s change the created method to fetch; that should work, right?

export default {
 fetch () {
   // Unfortunately the below line throws an error
   // because 'this.$store' is undefined...
   this.$store.dispatch('product/load')
 },
 computed: {...}
}

Here’s the catch: These “supercharged” methods run before the component is created, so this doesn’t point to the component and nothing on it can be accessed. So how do we access the Store here?

The Context API

Of course, there is a solution. On all of Nuxt’s methods, you are provided with an argument (normally the first) containing an extremely useful object called the Context. It contains everything you might need to reference across the app, meaning we don’t have to wait for Vue to create those references on the component first.

I would highly recommend checking out the Context docs to see what is available. Some handy ones are app, where you can access all your plugins; redirect, which can be used to change routes; error, to display the error page; and some self-explanatory ones such as route, query, and store.

So, to access the Store, we can destructure the Context and extract the Store from it. We also need to make sure we return a promise so that Nuxt can wait for it to resolve before rendering the component, so we need to make a small adjustment to our Store action too.

// Component
export default {
 fetch ({ store }) {
   return store.dispatch('product/load')
 },
 computed: {...}
}

// Store Action
load ({ commit }) {
 return new Promise(resolve => {
   setTimeout(() => {
     commit('update', { _id: 1, title: 'Product', price: 99.99 })
     resolve()
   }, 1000)
 })
}

You could use async/await or other methods depending on your coding style, but the concept is the same—we’re telling Nuxt to make sure the API call finishes and the Store is updated with the result before trying to render the component. If you try navigating to /products/view-async, you will not see the flash of content where the product is in its initial state.
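
For instance, the same fetch could be written with async/await; this is purely a stylistic variation of the code above:

export default {
 async fetch ({ store }) {
   await store.dispatch('product/load')
 },
 computed: {...}
}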

You can imagine how useful this can be in any Vue app, even without SSR. The Context is also available to all middleware as well as to other Nuxt methods such as nuxtServerInit, which is a special store action that runs before the Store is initialized (an example of this is in the next section).

Considerations When Using SSR

I’m sure many (myself included) who start using a technology such as Nuxt while treating it like any other Vue project eventually hit a wall where something we know would normally work seems impossible in Nuxt. As more of these caveats are documented, it will be easier to overcome, but the main thing to consider when starting to debug is that the client and server are two separate entities.

When you access a page initially, a request is sent to Nuxt, the server builds as much as possible of that page and the rest of the app, and then the server sends it to you. Then the responsibility is on the client to continue with navigation and load chunks as it needs them.

We want the server to do as much as possible first, but sometimes it doesn’t have access to the information it needs, which results in the work being done client-side instead. Or worse, when the final content presented by the client is different from what the server expected, the client is told to rebuild it from scratch. This is a big indication that something is wrong with the application logic. Thankfully, an error will be generated in your browser’s console (in development mode) if this starts to happen.

Let’s take an example of how to solve a common issue, session management. Imagine you have a Vue app where you can log in to an account, and your session is stored using a token (JWT, for example) which you decide to keep in localStorage. When you initially access the site, you want to authenticate that token against an API, which returns some basic user info if valid and puts that information in the Store.

After reading through Nuxt’s docs, you see that there’s a handy method called NuxtServerInit which allows you to asynchronously populate the Store once on initial load. That sounds perfect! So you create your user module in the Store and add the appropriate action in the index.js file in the Store directory:

export const actions = {
 nuxtServerInit ({ dispatch }) {
   // localStorage should work, right?
   const token = localStorage.getItem('token')
   if (token) return dispatch('user/load', token)
 }
}

When you refresh the page, you get an error: localStorage is not defined. Thinking about where this is happening, it makes sense. This method is run on the server; it has no idea what is stored in localStorage on the client; in fact, it doesn’t even know what “localStorage” is! So that’s not an option.

The server tries to execute localStorage.getItem('token') and throws an error.

So what’s the solution? There are a few, actually. You can get the client to initialize the Store instead but end up losing the benefits of SSR because the client ends up doing all the work. You can set up sessions on the server and then use that to authenticate the user, but that’s another layer to set up. What’s most similar to the localStorage method is using cookies instead.

Nuxt has access to cookies because they are sent with the request from the client to the server. As with other Nuxt methods, nuxtServerInit has access to the Context, this time as the second argument because the first is reserved for the store. On the Context, we can access the req object, which stores all the headers and other information from the client request. (This will be especially familiar if you’ve used Node.js.)

So after storing the token in a cookie instead (called “token,” in this case), let’s access it on the server.

import Cookie from 'cookie'

export const actions = {
 nuxtServerInit ({ dispatch }, { req }) {
   const cookies = Cookie.parse(req.headers.cookie || '')
   const token = cookies['token'] || ''
   if (token) return dispatch('user/load', token)
 }
}

A simple solution, but one that might not be immediately obvious. Learning to think about where certain actions are happening (client, server, or both) and what they have access to takes some time, but the benefits are worth it.

Deployment

Deployment with Nuxt is extremely simple. Using the same codebase, you can create an SSR app, single-page application, or static page.

Server-side Rendered App (SSR App)

This is probably what you were aiming for when using Nuxt. The basic concept for deployment here is to run the build process on whatever platform you choose and set a few configurations. I’ll use the Heroku example from the docs:

First, set up scripts for Heroku in package.json:

"scripts": {
 "dev": "nuxt",
 "build": "nuxt build",
 "start": "nuxt start",
 "heroku-postbuild": "npm run build"
}

Then set up the Heroku environment using the heroku-cli (setup instructions here):

# set Heroku variables
heroku config:set NPM_CONFIG_PRODUCTION=false
heroku config:set HOST=0.0.0.0
heroku config:set NODE_ENV=production

# deploy
git push heroku master

That’s it. Now your SSR Vue app is live and ready for the world to see. Other platforms have different setups, but the process is similar; the Nuxt documentation lists the officially supported deployment methods.

Single-page Application (SPA)

If you wanted to take advantage of some of the extra features Nuxt provides but avoid the server trying to render pages, then you can deploy as an SPA instead.

First, it’s best to test your application without SSR, as by default npm run dev runs with SSR on. To change that, edit the nuxt.config.js file and add the following option:

mode: 'spa',

Now, when you run npm run dev, SSR will be turned off and the application will run as an SPA for you to test. This setting also makes sure no future builds will include SSR.

If everything looks fine, then deployment is exactly the same as for an SSR app. Just remember you need to set mode: 'spa' first to let the build process know you want an SPA.

Static Pages

If you don’t want to deal with a server at all and instead want to generate pages for use with static hosting services such as Surge or Netlify, then this is the option to choose. Just bear in mind that, without a server, you won’t be able to access the req and res in the Context, so if your code relies on that, be sure to accommodate it. For example, when generating the example project, the nuxtServerInit function throws an error because it’s trying to fetch a token from the cookies in the request headers. In this project, it doesn’t matter, as that data isn’t being used anywhere, but in a real application, there would need to be an alternative way to access that data.
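
One possible accommodation (a sketch, not how the example project handles it) is to skip the cookie check when there is no request object, which is the case during nuxt generate:

import Cookie from 'cookie'

export const actions = {
 nuxtServerInit ({ dispatch }, { req }) {
   // During static generation there is no request, so there are no cookies to read.
   if (!req) return
   const cookies = Cookie.parse(req.headers.cookie || '')
   const token = cookies['token'] || ''
   if (token) return dispatch('user/load', token)
 }
}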

Once that’s sorted, deployment is easy. One thing you will probably need to change first is adding an option so that the nuxt generate command will also create a fallback file. This file will prompt the hosting service to let Nuxt handle the routing rather than the hosting service throwing a 404 error. To do so, add the following line to nuxt.config.js:

generate: { fallback: true },

Here’s an example using Netlify, which isn’t currently in the Nuxt docs. Just bear in mind that if this is your first time using netlify-cli, you will be prompted to authenticate:

# install netlify-cli globally
npm install netlify-cli -g

# generate the application (outputs to dist/ folder)
npm run generate

# deploy
netlify deploy dist

It’s as simple as that! As mentioned at the beginning of the article, there’s a version of this project here. There’s also official deployment documentation for other static hosting services in the Nuxt docs.

Learn More

Nuxt is updating rapidly, and this is only a small selection of the features it offers. I hope this article encourages you to try it out and see if it could help improve the capabilities of your Vue applications, allowing you to develop faster and take advantage of its powerful features.

If you’re looking for more information, then look no further than Nuxt’s official documentation.

Looking to up your JavaScript game? Try reading The Comprehensive Guide to JavaScript Design Patterns by fellow Toptaler Marko Mišura.

This post was written by Federico Pereiro, JavaScript Developer for Toptal.

Declarative programming is, currently, the dominant paradigm of an extensive and diverse set of domains such as databases, templating and configuration management.

In a nutshell, declarative programming consists of instructing a program on what needs to be done, instead of telling it how to do it. In practice, this approach entails providing a domain-specific language (DSL) for expressing what the user wants, and shielding them from the low-level constructs (loops, conditionals, assignments) that materialize the desired end state.

While this paradigm is a remarkable improvement over the imperative approach that it replaced, I contend that declarative programming has significant limitations, limitations that I explore in this article. Moreover, I propose a dual approach that captures the benefits of declarative programming while superseding its limitations.

CAVEAT: This article emerged as the result of a multi-year personal struggle with declarative tools. Many of the claims I present here are not thoroughly proven, and some are even presented at face value. A proper critique of declarative programming would take considerable time and effort, and I would have to go back and use many of these tools; my heart is not in such an undertaking. The purpose of this article is to share a few thoughts with you, pulling no punches, and showing what worked for me. If you’ve struggled with declarative programming tools, you might find respite and alternatives. And if you enjoy the paradigm and its tools, don’t take me too seriously.

If declarative programming works well for you, I’m in no position to tell you otherwise.

You can love or hate declarative programming, but you cannot afford to ignore it.

The Merits Of Declarative Programming

Before we explore the limits of declarative programming, it is necessary to understand its merits.

Arguably the most successful declarative programming tool is the relational database (RDB). It might even be the first declarative tool. In any case, RDBs exhibit the two properties that I consider archetypical of declarative programming:

  • A domain specific language (DSL): the universal interface for relational databases is a DSL named Structured Query Language, most commonly known as SQL.
  • The DSL hides the lower level layer from the user: ever since Edgar F. Codd’s original paper on RDBs, it is plain that the power of this model is to dissociate the desired queries from the underlying loops, indexes and access paths that implement them.

Before RDBs, most database systems were accessed through imperative code, which is heavily dependent on low-level details such as the order of records, indexes and the physical paths to the data itself. Because these elements change over time, code often stops working because of some underlying change in the structure of the data. The resulting code is hard to write, hard to debug, hard to read and hard to maintain. I’ll go out on a limb and say that most of this code was, in all likelihood, long and full of the proverbial rats’ nests of conditionals, repetition and subtle, state-dependent bugs.

In the face of this, RDBs provided a tremendous productivity leap for systems developers. Now, instead of thousands of lines of imperative code, you had a clearly defined data scheme, plus hundreds (or even just tens) of queries. As a result, applications had only to deal with an abstract, meaningful and lasting representation of data, and interface it through a powerful, yet simple query language. The RDB probably raised the productivity of programmers, and companies that employed them, by an order of magnitude.

What are the commonly listed advantages of declarative programming?

Proponents of declarative programming are quick to point out the advantages. However, even they admit it comes with trade-offs.
  1. Readability/usability: a DSL is usually closer to a natural language (like English) than to pseudocode, hence more readable and also easier to learn by non-programmers.
  2. Succinctness: much of the boilerplate is abstracted by the DSL, leaving fewer lines to do the same work.
  3. Reuse: it is easier to create code that can be used for different purposes; something that’s notoriously hard when using imperative constructs.
  4. Idempotence: you can work with end states and let the program figure it out for you. For example, through an upsert operation, you can either insert a row if it is not there, or modify it if it is already there, instead of writing code to deal with both cases.
  5. Error recovery: it is easy to specify a construct that will stop at the first error instead of having to add error listeners for every possible error. (If you’ve ever written three nested callbacks in node.js, you know what I mean.)
  6. Referential transparency: although this advantage is commonly associated with functional programming, it is actually valid for any approach that minimizes manual handling of state and relies on side effects.
  7. Commutativity: the possibility of expressing an end state without having to specify the actual order in which it will be implemented.

While the above are all commonly cited advantages of declarative programming, I would like to condense them into two qualities, which will serve as guiding principles when I propose an alternative approach.

  1. A high-level layer tailored to a specific domain: declarative programming creates a high-level layer using the information of the domain to which it applies. It is clear that if we’re dealing with databases, we want a set of operations for dealing with data. Most of the seven advantages above stem from the creation of a high-level layer that is precisely tailored to a specific problem domain.
  2. Poka-yoke (fool-proofness): a domain-tailored high-level layer hides the imperative details of the implementation. This means that you commit far fewer errors because the low-level details of the system are simply not accessible. This limitation eliminates many classes of errors from your code.

Two Problems With Declarative Programming

In the following two sections, I will present the two main problems of declarative programming: separateness and lack of unfolding. Every critique needs its bogeyman, so I will use HTML templating systems as a concrete example of the shortcomings of declarative programming.

The Problem With DSLs: Separateness

Imagine that you need to write a web application with a non-trivial number of views. Hard coding these views into a set of HTML files is not an option because many components of these pages change.

The most straightforward solution, which is to generate HTML by concatenating strings, seems so horrible that you will quickly look for an alternative. The standard solution is to use a template system. Although there are different types of template systems, we will sidestep their differences for the purpose of this analysis. We can consider all of them to be similar in that the main mission of template systems is to provide an alternative to code that concatenates HTML strings using conditionals and loops, much like RDBs emerged as an alternative to code that looped through data records.

Let’s suppose we go with a standard templating system; you will encounter three sources of friction, which I will list in ascending order of importance. The first is that the template necessarily resides in a file separate from your code. Because the templating system uses a DSL, the syntax is different, so it cannot be in the same file. In simple projects, where file counts are low, the need to keep separate template files may duplicate or treble the amount of files.

I open an exception for Embedded Ruby templates (ERB), because those are integrated into Ruby source code. This is not the case for ERB-inspired tools written in other languages since those templates must also be stored as different files.

The second source of friction is that the DSL has its own syntax, one different from that of your programming language. Hence, modifying the DSL (let alone writing your own) is considerably harder. To go under the hood and change the tool, you need to learn about tokenizing and parsing, which is interesting and challenging, but hard. I happen to see this as a disadvantage.

How can one visualize a DSL? It’s not easy, but let’s just say a DSL is a clean, shiny layer on top of low-level constructs.

You may ask, “Why on earth would you want to modify your tool? If you are doing a standard project, a well-written standard tool should fit the bill.” Maybe yes, maybe no.

A DSL never has the full power of a programming language. If it did, it wouldn’t be a DSL anymore, but rather a full programming language.

But isn’t that the whole point of a DSL? To not have the full power of a programming language available, so that we can achieve abstraction and eliminate most sources of bugs? Maybe, yes. However, most DSLs start simple and then gradually incorporate a growing number of the facilities of a programming language until, in fact, it becomes one. Template systems are a perfect example. Let’s see the standard features of template systems and how they correlate to programming language facilities:

  • Replace text within a template: variable substitution.
  • Repetition of a template: loops.
  • Avoid printing a template if a condition is not met: conditionals.
  • Partials: subroutines.
  • Helpers: subroutines (the only difference with partials is that helpers can access the underlying programming language and let you out of the DSL straightjacket).

This argument, that a DSL is limited because it simultaneously covets and rejects the power of a programming language, is directly proportional to the extent that the features of the DSL are directly mappable to the features of a programming language. In the case of SQL, the argument is weak because most of the things SQL offers are nothing like what you find in a normal programming language. At the other end of the spectrum, we find template systems where virtually every feature is making the DSL converge towards BASIC.

Let’s now step back and contemplate these three quintessential sources of friction, summed up by the concept of separateness. Because it is separate, a DSL needs to be located on a separate file; it is harder to modify (and even harder to write your own), and (often, but not always) needs you to add, one by one, the features you miss from a real programming language.

Separateness is an inherent problem of any DSL, no matter how well designed.

We now turn to a second problem of declarative tools, which is widespread but not inherent.

Another Problem: Lack Of Unfolding Leads To Complexity

If I had written this article a few months ago, this section would have been named Most Declarative Tools Are #@!$#@! Complex But I Don’t Know Why. In the process of writing this article I found a better way of putting it: Most Declarative Tools Are Way More Complex Than They Need To Be. I will spend the rest of this section explaining why. To analyze the complexity of a tool, I propose a measure called the complexity gap. The complexity gap is the difference between solving a given problem with a tool versus solving it in the lower level (presumably, plain imperative code) that the tool intends to replace. When the former solution is more complex than the latter, we are in the presence of a complexity gap. By more complex, I mean more lines of code, code that’s harder to read, harder to modify and harder to maintain, but not necessarily all of these at the same time.

Please note that we’re not comparing the lower level solution against the best possible tool, but rather against no tool. This echoes the medical principle of “First, do no harm”.

Signs of a tool with a large complexity gap are:

  • Something that takes a few minutes to describe in rich detail in imperative terms will take hours to code using the tool, even when you know how to use the tool.
  • You feel you are constantly working around the tool rather than with the tool.
  • You are struggling to solve a straightforward problem that squarely belongs in the domain of the tool you are using, but the best Stack Overflow answer you find describes a workaround.
  • When this very straightforward problem could be solved by a certain feature (which does not exist in the tool) and you see a Github issue in the library that features a long discussion of said feature with +1s interspersed.
  • A chronic, itching longing to ditch the tool and do the whole thing yourself inside a for loop.

I might have fallen prey to emotion here since template systems are not that complex, but this comparatively small complexity gap is not a merit of their design, but rather because the domain of applicability is quite simple (remember, we’re just generating HTML here). Whenever the same approach is used for a more complex domain (such as configuration management) the complexity gap may quickly turn your project into a quagmire.

That said, it is not necessarily unacceptable for a tool to be somewhat more complex than the lower level it intends to replace; if the tool yields code that is more readable, concise and correct, it can be worth it. It’s an issue when the tool is several times more complex than the problem it replaces; this is flat-out unacceptable. Brian Kernighan famously stated that, “Controlling complexity is the essence of computer programming.” If a tool adds significant complexity to your project, why even use it?

The question is, why are some declarative tools so much more complex than they need be? I think it would be a mistake to blame it on poor design. Such a general explanation, a blanket ad-hominem attack on the authors of these tools, is not fair. There has to be a more accurate and enlightening explanation.

Origami time! A tool with a high-level interface to an abstract lower level has to unfold the higher level from the lower one.

My contention is that any tool that offers a high level interface to abstract a lower level must unfold this higher level from the lower one. The concept of unfolding comes from Christopher Alexander’s magnum opus, The Nature of Order – in particular Volume II. It is (hopelessly) beyond the scope of this article (not to mention my understanding) to summarize the implications of this monumental work for software design; I believe its impact will be huge in years to come. It is also beyond this article to provide a rigorous definition of unfolding processes. I will use here the concept in a heuristic way.

An unfolding process is one that, in a stepwise fashion, creates further structure without negating the existing one. At every step, each change (or differentiation, to use Alexander’s term) remains in harmony with any previous structure, where previous structure is, simply, a crystallized sequence of past changes.

Interestingly enough, Unix is a great example of the unfolding of a higher level from a lower one. In Unix, two complex features of the operating system, batch jobs and coroutines (pipes), are simply extensions of basic commands. Because of certain fundamental design decisions, such as making everything a stream of bytes, the shell being a userland program, and standard I/O files, Unix is able to provide these sophisticated features with minimal complexity.

To underline why these are excellent examples of unfolding, I would like to quote a few excerpts of a 1979 paper by Dennis Ritchie, one of the authors of Unix:

On batch jobs:

… the new process control scheme instantly rendered some very valuable features trivial to implement; for example detached processes (with &) and recursive use of the shell as a command. Most systems have to supply some sort of special batch job submission facility and a special command interpreter for files distinct from the one used interactively.

On coroutines:

The genius of the Unix pipeline is precisely that it is constructed from the very same commands used constantly in simplex fashion.

UNIX pioneers Dennis Ritchie and Ken Thompson created a powerful demonstration of unfolding in their OS. They also saved us from a dystopian all-Windows future.

This elegance and simplicity, I argue, comes from an unfolding process. Batch jobs and coroutines are unfolded from previous structures (commands run in a userland shell). I believe that because of the minimalist philosophy and limited resources of the team that created Unix, the system evolved stepwise, and as such, was able to incorporate advanced features without turning its back on the basic ones because there weren’t enough resources to do otherwise.

In the absence of an unfolding process, the high level will be considerably more complex than necessary. In other words, the complexity of most declarative tools stems from the fact that their high level does not unfold from the low level they intend to replace.

This lack of unfoldance, if you forgive the neologism, is routinely justified by the necessity to shield the user from the lower level. This emphasis on poka-yoke (protecting the user from low level errors) comes at the expense of a large complexity gap that is self-defeating because the extra complexity will generate new classes of errors. To add insult to injury, these classes of errors have nothing to do with the problem domain but rather with the tool itself. We would not go too far if we describe these errors as iatrogenic.

Declarative templating tools, at least when applied to the task of generating HTML views, are an archetypical case of a high level that turns its back on the low level it intends to replace. How so? Because generating any non-trivial view requires logic, and templating systems, especially logic-less ones, banish logic through the main door and then smuggle some of it back through the cat door.

Note: An even weaker justification for a large complexity gap is when a tool is marketed as magic, or something that just works: the opaqueness of the low level is supposed to be an asset because a magic tool is always supposed to work without you understanding why or how. In my experience, the more magical a tool purports to be, the faster it transmutes my enthusiasm into frustration.

But what about the separation of concerns? Shouldn’t view and logic remain separate? The core mistake, here, is to put business logic and presentation logic in the same bag. Business logic certainly has no place in a template, but presentation logic exists nevertheless. Excluding logic from templates pushes presentation logic into the server where it is awkwardly accommodated. I owe the clear formulation of this point to Alexei Boronine, who makes an excellent case for it in this article.

My feeling is that roughly two thirds of the work of a template resides in its presentation logic, while the other third deals with generic issues such as concatenating strings, closing tags, escaping special characters, and so on. This is the two-faced low level nature of generating HTML views. Templating systems deal appropriately with the second part, but they don’t fare well with the first. Logic-less templates flat out turn their back on this problem, forcing you to solve it awkwardly. Other template systems suffer because they truly need to provide a non-trivial programming language so their users can actually write presentation logic.

To sum up; declarative templating tools suffer because:

  • If they were to unfold from their problem domain, they would have to provide ways to generate logical patterns;
  • A DSL that provides logic is not really a DSL, but a programming language. Note that other domains, like configuration management, also suffer from lack of “unfoldance.”

I would like to close the critique with an argument that is logically disconnected from the thread of this article, but deeply resonates with its emotional core: We have limited time to learn. Life is short, and on top of that, we need to work. In the face of our limitations, we need to spend our time learning things that will be useful and withstand time, even in the face of fast changing technology. That is why I exhort you to use tools that don’t just provide a solution but actually shed a bright light on their domain of applicability. RDBs teach you about data, and Unix teaches you about OS concepts, but with unsatisfactory tools that don’t unfold, I’ve always felt I was learning the intricacies of a sub-optimal solution while remaining in the dark about the nature of the problem it intends to solve.

The heuristic I suggest you consider is: value tools that illuminate their problem domain, instead of tools that obscure their problem domain behind purported features.

The Twin Approach

To overcome the two problems of declarative programming, which I have presented here, I propose a twin approach:

  • Use a data structure domain specific language (dsDSL), to overcome separateness.
  • Create a high level that unfolds from the lower level, to overcome the complexity gap.

dsDSL

A data structure DSL (dsDSL) is a DSL that is built with the data structures of a programming language. The core idea is to use the basic data structures you have available, such as strings, numbers, arrays, objects and functions, and combine them to create abstractions to deal with a specific domain.

We want to keep the power of declaring structures or actions (high level) without having to specify the patterns that implement these constructs (low level). We want to overcome the separateness between the DSL and our programming language so that we are free to use the full power of a programming language whenever we need it. This is not only possible but straightforward through dsDSLs.

If you asked me a year ago, I would have thought that the concept of dsDSL was novel, then one day, I realized that JSON itself was a perfect example of this approach! A parsed JSON object consists of data structures that declaratively represent data entries in order to get the advantages of the DSL while also making it easy to parse and handle from within a programming language. (There might be other dsDSLs out there, but so far I haven’t come across any. If you know of one, I would really appreciate your mentioning it in the comments section.)

Like JSON, a dsDSL has the following attributes:

  1. It consists of a very small set of functions: JSON has two main functions, parse and stringify.
  2. Its functions most commonly receive complex and recursive arguments: a parsed JSON is an array, or object, which usually contains further arrays and objects inside.
  3. The inputs to these functions conform to very specific forms: JSON has an explicit and strictly enforced validation schema to tell valid from invalid structures.
  4. Both the inputs and the outputs of these functions can be contained and generated by a programming language without a separate syntax.
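
As a trivial illustration of these attributes, here are JSON’s two functions doing a round trip between a string and a data structure that we can manipulate directly from the programming language:

var parsed = JSON.parse ('{"tag": "a", "attributes": {"href": "views"}}');
parsed.contents = 'Index';
JSON.stringify (parsed);
// '{"tag":"a","attributes":{"href":"views"},"contents":"Index"}'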

But dsDSLs go beyond JSON in many ways. Let’s create a dsDSL for generating HTML using Javascript. Later I will touch on the issue of whether this approach may be extended to other languages (spoiler: It can definitely be done in Ruby and Python, but probably not in C).

HTML is a markup language composed of tags delimited by angle brackets (< and >). These tags may have optional attributes and contents. Attributes are simply a list of key/value attributes, and contents may be either text or other tags. Both attributes and contents are optional for any given tag. I’m simplifying somewhat, but it is accurate.

A straightforward way to represent an HTML tag in a dsDSL is by using an array with three elements:

  • Tag: a string.
  • Attributes: an object (of the plain, key/value type) or undefined (if no attributes are necessary).
  • Contents: a string (text), an array (another tag) or undefined (if there’s no contents).

For example, <a href="views">Index</a> can be written as ['a', {href: 'views'}, 'Index'].

If we want to embed this anchor element into a div with class links, we can write: ['div', {class: 'links'}, ['a', {href: 'views'}, 'Index']].

To list several html tags at the same level, we can wrap them in an array:

[
   ['h1', 'Hello!'],
   ['a', {href: 'views'}, 'Index']
]

The same principle may be applied to creating multiple tags within a tag:

['body', [
   ['h1', 'Hello!'],
   ['a', {href: 'views'}, 'Index']
]]

Of course, this dsDSL won’t get us far if we don’t generate HTML from it. We need a generate function which will take our dsDSL and yield a string with HTML. So if we run generate (['a', {href: 'views'}, 'Index']), we will get the string <a href="views">Index</a>.

The idea behind any DSL is to specify a few constructs with a specific structure which is then passed to a function. In this case, the structure that makes up the dsDSL is this array, which has one to three elements; these arrays have a specific structure. If generate thoroughly validates its input (and it is both easy and important to thoroughly validate input, since these validation rules are the precise analog of a DSL’s syntax), it will tell you exactly where you went wrong with your input. After a while, you’ll start to recognize what distinguishes a valid structure in a dsDSL, and this structure will be highly suggestive of the underlying thing it generates.
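
To make this concrete, here is a minimal sketch of such a generate function. This is not lith’s actual implementation; it skips thorough validation, void tags, and escaping, and only illustrates the recursive idea:

function generate (input) {
   // Strings and numbers are printed as-is; undefined and empty arrays produce nothing.
   if (! Array.isArray (input)) return input === undefined ? '' : String (input);
   if (input.length === 0) return '';

   // An array whose first element is also an array is a list of sibling tags.
   if (Array.isArray (input [0])) return input.map (generate).join ('');

   var tag           = input [0];
   var hasAttributes = input [1] !== null && typeof input [1] === 'object' && ! Array.isArray (input [1]);
   var attributes    = hasAttributes ? input [1] : {};
   var contents      = hasAttributes ? input [2] : input [1];

   var attributeString = Object.keys (attributes).map (function (key) {
      return attributes [key] === undefined ? '' : ' ' + key + '="' + attributes [key] + '"';
   }).join ('');

   return '<' + tag + attributeString + '>' + generate (contents) + '</' + tag + '>';
}

// generate (['a', {href: 'views'}, 'Index']) returns '<a href="views">Index</a>'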

Now, what are the merits of a dsDSL in contraposition to a DSL?

  • A dsDSL is an integral part of your code. It leads to lower line counts, file counts, and an overall reduction of overhead.
  • dsDSLs are easy to parse (hence easier to implement and modify). Parsing is merely iterating through the elements of an array or object. Likewise, dsDSLs are comparatively easy to design because instead of creating a new syntax (that everybody will hate) you can stick with the syntax of your programming language (which everybody hates but at least they already know it).
  • A dsDSL has all the power of a programming language. This means that a dsDSL, when properly employed, has the advantage of both a high and a low level tool.

Now, the last claim is a strong one, so I’m going to spend the rest of this section supporting it. What do I mean by properly employed? To see this in action, let’s consider an example in which we want to construct a table to display the information from an array named DATA.

var DATA = [
   {id: 1, description: 'Product 1', price: 20,  onSale: true,  categories: ['a']},
   {id: 2, description: 'Product 2', price: 60,  onSale: false, categories: ['b']},
   {id: 3, description: 'Product 3', price: 120, onSale: false, categories: ['a', 'c']},
   {id: 4, description: 'Product 4', price: 45,  onSale: true,  categories: ['a', 'b']}
]

In a real application, DATA will be generated dynamically from a database query.

Moreover, we have a FILTER variable which, when initialized, will be an array with the categories we want to display.

We want our table to:

  • Display table headers.
  • For each product, show the fields: description, price and categories.
  • Don’t print the id field, but add it as an id attribute to each tr element.
  • Place a class onSale if the product is on sale.
  • Sort the products by descending price.
  • Filter certain products by category. If FILTER is an empty array, we will display all products. Otherwise, we will only display the products where the category of the product is contained within FILTER.

We can create the presentation logic that matches this requirement in ~20 lines of code:

function drawTable (DATA, FILTER) {

   var printableFields = ['description', 'price', 'categories'];

   DATA.sort (function (a, b) {return b.price - a.price});

   return ['table', [
      ['tr', dale.do (printableFields, function (field) {
         return ['th', field];
      })],
      dale.do (DATA, function (product) {
         var matches = (! FILTER || FILTER.length === 0) || dale.stop (product.categories, true, function (category) {
            return FILTER.indexOf (category) !== -1;
         });

         return matches === false ? [] : ['tr', {
            id: product.id,
            class: product.onSale ? 'onsale' : undefined
         }, dale.do (printableFields, function (field) {
            return ['td', product [field]];
         })];
      })
   ]];
}

I concede this is not a straightforward example, however, it represents a fairly simple view of the four basic functions of persistent storage, also known as CRUD. Any non-trivial web application will have views that are more complex than this.

Let’s now see what this code is doing. First, it defines a function, drawTable, to contain the presentation logic of drawing the product table. This function receives DATA and FILTER as parameters, so it can be used for different data sets and filters. drawTable fulfills the double role of partial and helper.

   function drawTable (DATA, FILTER) {

The inner variable, printableFields, is the only place where you need to specify which fields are printable ones, avoiding repetition and inconsistencies in the face of changing requirements.

   var printableFields = ['description', 'price', 'categories'];

We then sort DATA by descending price. Notice that different and more complex sort criteria would be straightforward to implement since we have the entire programming language at our disposal.

   DATA.sort (function (a, b) {return b.price - a.price});

Here we return an array literal, which contains table as its first element and its contents as the second. This is the dsDSL representation of the <table> we want to create.

   return ['table', [

We now create a row with the table headers. To create its contents, we use dale.do which is a function like Array.map, but one that also works for objects. We will iterate printableFields and generate table headers for each of them:

      ['tr', dale.do (printableFields, function (field) {
         return ['th', field];
      })],

Notice that we have just implemented iteration, the workhorse of HTML generation, and we didn’t need any DSL constructs; we only needed a function to iterate a data structure and return dsDSLs. A similar native, or user-implemented function, would have done the trick as well.

Now iterate through the products contained in DATA.

      dale.do (DATA, function (product) {

We check whether this product is left out by FILTER. If FILTER is empty, we will print the product. If FILTER is not empty, we will iterate through the categories of the product until we find one that is contained within FILTER. We do this using dale.stop.

         var matches = (! FILTER || FILTER.length === 0) || dale.stop (product.categories, true, function (category) {
            return FILTER.indexOf (category) !== -1;
         });

Notice the intricacy of the conditional; it is precisely tailored to our requirement and we have total freedom for expressing it because we are in a programming language rather than a DSL.

If matches is false, we return an empty array (so we don’t print this product). Otherwise, we return a <tr> with its proper id and class and we iterate through printableFields to, well, print the fields.


         return matches === false ? [] : ['tr', {
            id: product.id,
            class: product.onSale ? 'onsale' : undefined
         }, dale.do (printableFields, function (field) {
            return ['td', product [field]];

Of course we close everything that we opened. Isn’t syntax fun?

         })];
      })
   ]];
}

Now, how do we incorporate this table into a wider context? We write a function named drawAll that will invoke all functions that generate the views. Apart from drawTable, we might also have drawHeader, drawFooter, and other comparable functions, all of which will return dsDSLs.

var drawAll = function () {
   return generate ([
      drawHeader (),
      drawTable (DATA, FILTER),
      drawFooter ()
   ]);
}

If you don’t like how the above code looks, nothing I say will convince you. This is a dsDSL at its best. You might as well stop reading the article (and drop a mean comment too because you’ve earned the right to do so if you’ve made it this far!). But seriously, if the code above doesn’t strike you as elegant, nothing else in this article will.

For those who are still with me, I would like to go back to the main claim of this section, which is that a dsDSL has the advantages of both the high and the low level:

  • The advantage of the low level resides in writing code whenever we want, getting out of the straightjacket of the DSL.
  • The advantage of the high level resides in using literals that represent what we want to declare and letting the functions of the tool convert that into the desired end state (in this case, a string with HTML).

But how is this truly different from purely imperative code? I think ultimately the elegance of the dsDSL approach boils down to the fact that code written in this way mostly consists of expressions, instead of statements. More precisely, code that uses a dsDSL is almost entirely composed of:

  • Literals that map to lower level structures.
  • Function invocations or lambdas within those literal structures that return structures of the same kind.

Code that consists mostly of expressions and which encapsulates most statements within functions is extremely succinct because all patterns of repetition can be easily abstracted. You can write arbitrary code as long as that code returns a literal that conforms to a very specific, non-arbitrary form.

A further characteristic of dsDSLs (which we don’t have time to explore here) is the possibility of using types to increase the richness and succinctness of the literal structures. I will expound on this issue in a future article.

Might it be possible to create dsDSLs beyond Javascript, the One True Language? I think that it is, indeed, possible, as long as the language supports:

  • Literals for: arrays, objects (associative arrays), function invocations, and lambdas.
  • Runtime type detection
  • Polymorphism and dynamic return types

I think this means that dsDSLs are tenable in any modern dynamic language (i.e.: Ruby, Python, Perl, PHP), but probably not in C or Java.

Walk, Then Slide: How To Unfold The High From The Low

In this section I will attempt to show a way of unfolding a high level tool from its domain. In a nutshell, the approach consists of the following steps:

  1. Take two to four problems that are representative instances of a problem domain. These problems should be real. Unfolding the high level from the low one is a problem of induction, so you need real data to come up with representative solutions.
  2. Solve the problems with no tool in the most straightforward way possible.
  3. Stand back, take a good look at your solutions, and notice the common patterns among them.
  4. Find the patterns of representation (high level).
  5. Find the patterns of generation (low level).
  6. Solve the same problems with your high level layer and verify that the solutions are indeed correct.
  7. If you feel that you can easily represent all the problems with your patterns of representation, and the generation patterns for each of these instances produce correct implementations, you’re done. Otherwise, go back to the drawing board.
  8. If new problems appear, solve them with the tool and modify it accordingly.
  9. The tool should converge asymptotically to a finished state, no matter how many problems it solves. In other words, the complexity of the tool should remain constant, rather than growing with the amount of problems it solves.

Now, what the hell are patterns of representation and patterns of generation? I’m glad you asked. The patterns of representation are the patterns in which you should be able to express a problem that belongs to the domain that concerns your tool. It is an alphabet of structures that allows you to write any pattern you might wish to express within its domain of applicability. In a DSL, these would be the production rules. Let’s go back to our dsDSL for generating HTML.

The humble HTML tag is a good example of patterns of representation. Let’s take a closer look at these basic patterns.

The patterns of representation for HTML are the following:

  • A single tag: ['TAG']
  • A single tag with attributes: ['TAG', {attribute1: value1, attribute2: value2, ...}]
  • A single tag with contents: ['TAG', 'CONTENTS']
  • A single tag with both attributes and contents: ['TAG', {attribute1: value1, ...}, 'CONTENTS']
  • A single tag with another tag inside: ['TAG1', ['TAG2', ...]]
  • A group of tags (standalone or inside another tag): [['TAG1', ...], ['TAG2', ...]]
  • Depending on a condition, place a tag or no tag: condition ? ['TAG', ...] : []
  • Depending on a condition, place an attribute or no attribute: ['TAG', {class: condition ? 'someClass' : undefined}, ...]

These instances can be represented with the dsDSL notation we determined in the previous section. And this is all you need to represent any HTML you might need. More sophisticated patterns, such as conditional iteration through an object to generate a table, may be implemented with functions that return the patterns of representation above, and these patterns map directly to HTML tags.

If the patterns of representation are the structures you use to express what you want, the patterns of generation are the structures your tool will use to convert patterns of representation into the lower level structures. For HTML, these are the following:

  • Validate the input (this is actually a universal pattern of generation).
  • Open and close tags (but not the void tags, like <input>, which are self-closing).
  • Place attributes and contents, escaping special characters (but not the contents of the <style> and <script> tags).

Believe it or not, these are the patterns you need to create an unfolding dsDSL layer that generates HTML. Similar patterns can be found for generating CSS. In fact, lith does both, in ~250 lines of code.

One last question remains to be answered: What do I mean by walk, then slide? When we deal with a problem domain, we want to use a tool that delivers us from the nasty details of that domain. In other words, we want to sweep the low level under the rug, the faster the better. The walk, then slide approach proposes exactly the opposite: spend some time on the low level. Embrace its quirks, and understand which are essential and which can be avoided in the face of a set of real, varied, and useful problems.

After walking in the low level for some time and solving useful problems, you will have a sufficiently deep understanding of their domain. The patterns of representation and generation will then arise naturally; they are wholly derived from the nature of the problem they intend to solve. You can then write code that employs them. If they work, you will be able to slide through problems where you recently had to walk through them. Sliding means many things; it implies speed, precision and lack of friction. Maybe more importantly, this quality can be felt; when solving problems with this tool, do you feel like you’re walking through the problem, or do you feel that you’re sliding through it?

Maybe the most important thing about an unfolded tool is not the fact that it frees us from having to deal with the low level. Rather, by capturing the empiric patterns of repetition in the low level, a good high level tool allows us to understand fully the domain of applicability.

An unfolded tool will not just solve a problem – it will enlighten you about the problem’s structure.

So, don’t run away from a worthy problem. First walk around it, then slide through it.

This post was written by Michael Karchevsky, C++ Developer for Toptal.

Competitions are a great way to level up machine learning skills. Not only do you get access to quality datasets, you are also given clear goals. This helps you focus on the important part: designing quality solutions for problems at hand.

A friend of mine and I recently took part in the N+1 fish, N+2 fish competition. This machine learning competition involves a lot of image processing: it requires you to process video clips of fish being identified, measured, and kept or thrown back into the sea.

an abstract image of machine learning used to identify and measure fish

In this article, I will walk you through how we approached the problem from the competition using standard image processing techniques and pre-trained neural network models. The performance of the submitted solutions was measured with an aggregated metric defined by the organizers (more on that below). With our solution, we managed to secure 11th place.

For a brief introduction to machine learning, you can refer to this article.

About the Competition

We were provided with videos of one or many fish in each segment. These videos were captured on different boats fishing for ground fish in the Gulf of Maine.

The videos were collected from fixed-position cameras placed to look down on a ruler. A fish is placed on the ruler, the fisherman removes their hands from the ruler, and then the fisherman either discards or keeps the fish based on the species and size.

a sample video clip from the project

Performance Metric

There were three tasks that were important to this project. The ultimate goal was to create an algorithm that automatically generates annotations for the video files, where the annotations consist of:

  • The sequence of fish that appear
  • The species of each fish that appears in the video
  • The length of each fish that appears in the video

The organizers of the competition created an aggregated metric that gave a general sense of performance on all of these tasks. The metric was a simple weighted combination of an individual metric for each of the tasks. Although each task had its own weight, the organizers recommended that we focus on a well-rounded algorithm able to contribute to all of the tasks.

You can learn more about how the overall performance metric is calculated from the performance metrics of each individual task from the official competition web page.

Designing a Machine Learning Solution

When working with machine learning projects dealing with pictures or videos, you will most likely be using convolutional neural networks. But, before we could use convolutional neural networks, we had to preprocess the frames and solve some other subtasks through different strategies.

For training, we used a single NVIDIA GTX 1080 Ti GPU. A good chunk of our time was lost optimizing our code so that training could keep up with the competition; in hindsight, that left us with less time for the parts that would have mattered more.

Stage 0: Finding Out the Number of Unique Boats

With silhouette analysis, finding the number of boats became a fairly trivial task. The steps were as follows, and leveraged some very standard techniques:

  1. Get some random frames from each video.
  2. Calculate statistics and Speeded Up Robust Features (SURF) for each image.
  3. Run K-means clustering on the features and use silhouette analysis to find the number of boats.

SURF detects points of interest in an image and generates feature descriptions. This approach is really robust, even with various image transformations.

Once the features of the points of interest in the image are known, K-means clustering is performed, followed by silhouette analysis to determine an approximate number of boats in the images.
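
A minimal sketch of this stage might look like the following. It assumes an OpenCV build with the non-free SURF module enabled; the Hessian threshold, the descriptor summary, and the range of candidate cluster counts are illustrative choices, not the exact values we used.

import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def frame_descriptor(frame):
    # Summarize a frame's SURF descriptors as one fixed-size vector.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    _, desc = surf.detectAndCompute(gray, None)
    if desc is None:
        desc = np.zeros((1, 64), dtype=np.float32)
    return np.concatenate([desc.mean(axis=0), desc.std(axis=0)])

def estimate_boat_count(frames, k_range=range(2, 8)):
    # Cluster frame descriptors and keep the k with the best silhouette score.
    X = np.vstack([frame_descriptor(f) for f in frames])
    scores = {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return max(scores, key=scores.get)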

Stage 1: Identifying Repeated Frames

Although the dataset contained separate video files, each video seemed to have some overlaps with other videos in the dataset. This is possibly because the videos were split from one long video and thus ended up having a few common frames at the start or end of each video.

a graphic representation of common frames

To identify such frames and remove them as necessary, we used some quick hash functions on the frames.
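
For example, a simple average hash over downscaled grayscale frames is enough to flag near-duplicates; the hash size and distance threshold here are illustrative, not necessarily what we used.

import cv2
import numpy as np

def average_hash(frame, size=8):
    # Downscale to size x size, then threshold against the mean brightness.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, (size, size), interpolation=cv2.INTER_AREA)
    return (small > small.mean()).flatten()

def is_duplicate(hash_a, hash_b, max_bits=4):
    # Frames whose hashes differ in only a few bits are treated as overlaps.
    return int(np.count_nonzero(hash_a != hash_b)) <= max_bits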

Stage 2: Locating the Ruler

By applying some standard image processing methods, we located the position of the ruler and its orientation. We then rotated and cropped the image to position the ruler in a consistent manner across all frames. This also allowed us to reduce the frame size tenfold.

Detected ruler (plotted on the mean frame):

a visual depiction of the ruler detection process, machine learning

Cropped and rotated area:

a photograph of the cropped ruler
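
One way to implement this with standard tools is to run edge detection and a Hough transform on the mean frame and rotate by the dominant line angle; the thresholds below are illustrative rather than the exact values we used, and the crop itself is omitted.

import cv2
import numpy as np

def ruler_angle(mean_frame):
    # Estimate the ruler's orientation from the dominant straight lines.
    gray = cv2.cvtColor(mean_frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                            minLineLength=200, maxLineGap=10)
    angles = [np.arctan2(y2 - y1, x2 - x1) for x1, y1, x2, y2 in lines[:, 0]]
    return float(np.degrees(np.median(angles)))

def rotate_frame(frame, angle_deg):
    # Rotate so the ruler ends up horizontal; cropping would follow.
    h, w = frame.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle_deg, 1.0)
    return cv2.warpAffine(frame, M, (w, h))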

Stage 3: Determining the Sequence of the Fish

Implementing this stage to determine the sequence of the fish took a majority of my time during this competition. Training new convolutional neural networks seemed too expensive, so we decided to use pre-trained neural networks.

For this, we chose the following pretrained convolutional neural networks, the same ones we later reused in Stage 4:

  • VGG16
  • VGG19
  • ResNet50
  • Xception
  • InceptionV3

These neural network models are trained on the ImageNet dataset.

We extracted only the convolutional layers of the models and passed the competition dataset through them. At the output, we had a fairly compact array of features for each frame.

Then, for each pretrained model, we trained a small network consisting only of fully connected dense layers on top of those features and predicted results. We averaged the predictions across models, but the results turned out quite poor.

We decided to replace this with long short-term memory (LSTM) neural networks for better predictions, where the input data was a sequence of five frames transformed with the pretrained models.

To merge the output of all the models, we used the geometric mean.
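
A minimal Keras sketch of the two heads just described, a dense classifier on per-frame pretrained features and an LSTM over sequences of five frames, might look like this. The feature dimension and layer sizes are assumptions, not the exact architecture we used.

from tensorflow.keras import layers, models

FEATURE_DIM = 2048   # assumed size of the pretrained convolutional feature vector
SEQ_LEN = 5          # sequence of five frames, as described above

def dense_head():
    # Predicts the probability that a fish is present in a single frame.
    return models.Sequential([
        layers.Input(shape=(FEATURE_DIM,)),
        layers.Dense(256, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(1, activation='sigmoid'),
    ])

def lstm_head():
    # Predicts the same probability from a short sequence of frame features.
    return models.Sequential([
        layers.Input(shape=(SEQ_LEN, FEATURE_DIM)),
        layers.LSTM(128),
        layers.Dense(1, activation='sigmoid'),
    ])

# The two probabilities are merged with the geometric mean:
# p = (p_dense * p_lstm) ** 0.5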

The fish detection pipeline was:

  1. Generate features with pretrained models.
  2. Predict the probability of fish appearance on a dense neural network.
  3. Generate LSTM features with pretrained models.
  4. Predict the probability of fish appearance on an LSTM neural network.
  5. Merge models using the geometric mean.

The result for one video looks something like this:

a sample fish detection result shown on a graph with frame index along the x-axis and probability along the y-axis

Stage 4: Identify the Species of the Fish

Having spent the majority of the contest implementing the previous stage, we tried to make up for the lost time by reusing its models to identify the species of each fish.

Our approach for that was roughly as follows (a sketch in code follows the list):

  1. Add dense layers on top of the pretrained convolutional models VGG16, VGG19, ResNet50, Xception, and InceptionV3 (the weights of the convolutional layers were fixed).
  2. Train models with small image augmentation.
  3. Predict species with each model.
  4. Consolidate the models by voting.
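
Here is what that setup could look like in Keras for one of the backbones, VGG16. The input size, layer sizes, augmentation settings, and the number of species classes are illustrative assumptions.

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator

NUM_SPECIES = 7  # placeholder for the actual number of species classes

# Frozen convolutional base with a trainable dense head on top.
base = VGG16(weights='imagenet', include_top=False, pooling='avg',
             input_shape=(224, 224, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(NUM_SPECIES, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Small image augmentation during training.
augment = ImageDataGenerator(rotation_range=10, width_shift_range=0.05,
                             height_shift_range=0.05, horizontal_flip=True)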

Stage 5: Detect the Length of the Fish

To determine the length of the fish, we used two neural networks. One was trained to identify fish heads and the other was trained to identify fish tails. The length of each fish was approximated as the distance between the two points identified by these networks.

photograph showing the distance between a fish head and tail
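
The final measurement is then just the distance between the two predicted points, scaled by the ruler. A minimal sketch, where the pixels-per-centimeter value is an assumed placeholder derived from the known ruler length:

import math

def fish_length_cm(head_xy, tail_xy, pixels_per_cm=12.0):
    # Euclidean distance between the predicted head and tail points,
    # converted to centimeters via the ruler-derived scale.
    dx = head_xy[0] - tail_xy[0]
    dy = head_xy[1] - tail_xy[1]
    return math.hypot(dx, dy) / pixels_per_cm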

Complete Scheme

Here is what the overall scheme of the stages looked like:

flowchart depicting the complete scheme

The overall design was fairly simple: video frames were passed through the stages outlined above, and the separate results were then combined.
