This post was written by Jeremy Davis, JavaScript Developer for Toptal.

CSS was designed for documents, the kind of content the "old web" was expected to contain. The emergence of pre-processors like Sass or Less shows that the community needs more than what CSS offers. With web apps getting more and more complex over time, CSS' limitations have become increasingly visible and difficult to mitigate.

Styled-components leverages the power of a complete programming language—JavaScript—and its scoping capabilities to help structure the code into components. This helps to avoid the common pitfalls of writing and maintaining CSS for large projects. A developer can describe a component’s style with no risk of side effects.

What’s the Problem?

One advantage of using CSS is that the style is completely decoupled from the code. This means that developers and designers can work in parallel without interfering with each other.

On the other hand, styled-components makes it easier to fall into the trap of strongly coupling style and logic. Max Stoiber explains how to avoid this. While the idea of separating logic and presentation is definitely not new, one might be tempted to take shortcuts when developing React components. For example, it’s easy to create a component for a validation button that handles the click action as well as the button’s style. It takes a bit more effort to split it into two components.

The Container/Presentational Architecture

This is a pretty simple principle. Components either define how things look, or they manage data and logic. A very important aspect of presentation components is that they should never have any dependencies. They receive props and render DOM (or children) accordingly. Containers, on the other hand, know about the data architecture (state, redux, flux, etc.) but should never be responsible for display. Dan Abramov’s article is a very good and detailed explanation of this architecture.

Remembering SMACSS

Although the Scalable and Modular Architecture for CSS is a style guide for organizing CSS, the basic concept is one that is followed, for the most part automatically, by styled-components. The idea is to separate CSS into five categories:

  • Base contains all the general rules.
  • Layout’s purpose is to define the structural properties as well as the various sections of content (header, footer, sidebar, and content, for instance).
  • Module contains subcategories for the various logical blocks of the UI.
  • State defines modifier classes to indicate elements’ states, e.g. field in error, disabled button.
  • Theme contains color, font, and other cosmetic aspects that may be modifiable or dependent on user preference.

Keeping this separation while using styled-components is easy. Projects usually include some kind of CSS normalization or reset. This typically falls in the Base category. You may also define general font sizing, line sizing, etc. This can be done through normal CSS (or Sass/Less), or through the injectGlobal function provided by styled-components.
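
For instance, a minimal sketch of the Base rules with injectGlobal might look like this (the reset rules shown are only illustrative, and in styled-components v4+ the equivalent is createGlobalStyle):

import { injectGlobal } from 'styled-components'

// Base rules: injected once, globally, when this module is imported
injectGlobal`
  * {
    box-sizing: border-box;
  }

  body {
    margin: 0;
    font-family: sans-serif;
    line-height: 1.5;
  }
`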

For Layout rules, if you use a UI framework, then it will probably define container classes, or a grid system. You can easily use those classes in conjunction with your own rules in the layout components you write.

Module is automatically followed by the architecture of styled-components, since the styles are attached to components directly, rather than described in external files. Basically, each styled component that you write will be its own module. You can write your styling code without worrying about side effects.

State corresponds to rules you define within your components as variable rules. You simply define a function to interpolate the values of your CSS attributes. If you are using a UI framework, you might have useful classes to add to your components as well. You will probably also have CSS pseudo-selector rules (hover, focus, etc.).
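
As an illustration, a State rule for a disabled button could be interpolated from props; this sketch is not part of the example project:

import styled from 'styled-components'

const Button = styled.button`
  padding: 8px 16px;
  /* interpolate attribute values from the component's props */
  background-color: ${props => (props.disabled ? '#cccccc' : '#20a0ff')};
  cursor: ${props => (props.disabled ? 'not-allowed' : 'pointer')};

  &:hover {
    opacity: ${props => (props.disabled ? 1 : 0.85)};
  }
`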

The Theme can simply be interpolated within your components. It is a good idea to define your theme as a set of variables to be used throughout your application. You can even derive colors programmatically (using a library, or manually), for instance to handle contrasts and highlights. Remember that you have the full power of a programming language at your disposal!
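
For example, the theme could be a plain module whose values are partly derived; the polished helper and the values below are assumptions, not part of the example project:

import { lighten } from 'polished'

const core = '#20a0ff'

export const theme = {
  colors: {
    core,
    // derived programmatically instead of hard-coded
    'background-highlight': lighten(0.35, core),
    error: '#ff4949',
  },
}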

Bring Them Together for a Solution

It is important to keep these components close together for easier navigation; we don’t want to organize them by type (presentation vs. logic) but rather by functionality.

Thus, we will have a folder for all the generic components (buttons and such). The others should be organized depending on the project and its functionalities. For instance, if we have user management features, we should group all the components specific to that feature.

To apply the container/presentational architecture with styled-components in a SMACSS approach, we need an extra type of component: structural. We end up with three kinds of components: styled, structural, and container. Since styled-components decorates a tag (or component), we need this third type of component to structure the DOM. In some cases, it might be possible to let a container component handle the structure of sub-components, but when the DOM structure becomes complex and is required for visual purposes, it’s best to separate them. A good example is a table, where the DOM typically gets quite verbose.

Example Project

Let’s build a small app that displays recipes to illustrate these principles. We can start building a Recipes component. The parent component will be a controller. It will handle the state—in this case, the list of recipes. It will also call an API function to fetch the data.

class Recipes extends Component {
  constructor (props) {
    super(props);
    this.state = {
      recipes: []
    };
  }

  componentDidMount () {
    this.loadData()
  }

  loadData () {
    getRecipes().then(recipes => {
      this.setState({recipes})
    })
  }

  render() {
    let {recipes} = this.state

    return (
      <RecipesContainer recipes={recipes} />
    )
  }
}

It will render the list of recipes, but it does not (and should not) need to know how. So we render another component that gets the list of recipes and outputs DOM:

class RecipesContainer extends Component {
  render() {
    let {recipes} = this.props

    return (
      <TilesContainer>
        {recipes.map(recipe => (<Recipe key={recipe.id} {...recipe}/>))}
      </TilesContainer>
    )
  }
}

Here, we want to render a tile grid. Since the tile layout is likely to be reused, it’s a good idea to make it a generic component. If we extract it, we get a new component that looks like this:

class TilesContainer extends Component {
  render () {
    let {children} = this.props

    return (
      <Tiles>
        {
          React.Children.map(children, (child, i) => (
            <Tile key={i}>
              {child}
            </Tile>
          ))
        }
      </Tiles>
    )
  }
}

TilesStyles.js:

export const Tiles = styled.div`
  padding: 20px 10px;
  display: flex;
  flex-direction: row;
  flex-wrap: wrap;
`

export const Tile = styled.div`
  flex: 1 1 auto;
  ...
  display: flex;

  & > div {
    flex: 1 0 auto;
  }
`

Notice that this component is purely presentational. It defines its style and wraps whatever children it receives inside another styled DOM element that defines what tiles look like. It’s a good example of what your generic presentational components will look like architecturally.

Then, we need to define what a recipe looks like. We need a container component to describe the relatively complex DOM as well as define the style when necessary. We end up with this:

class RecipeContainer extends Component {
  onChangeServings (e) {
    let {changeServings} = this.props
    changeServings(e.target.value)
  }

  render () {
    let {title, ingredients, instructions, time, servings} = this.props

    return (
      <Recipe>
        <Title>{title}</Title>
        <div>{time}</div>
        <div>Serving
          <input type="number" min="1" max="1000" value={servings} onChange={this.onChangeServings.bind(this)}/>
        </div>
        <Ingredients>
          {ingredients.map((ingredient, i) => (
            <Ingredient key={i} servings={servings}>
              <span className="name">{ingredient.name}</span>
              <span className="quantity">{ingredient.quantity * servings} {ingredient.unit}</span>
            </Ingredient>
          ))}
        </Ingredients>
        <div>
          {instructions.map((instruction, i) => (<p key={i}>{instruction}</p>))}
        </div>
      </Recipe>
    )
  }
}

Notice here that the container does some DOM generation, but it’s the only logic it contains. Remember that you can define nested styles, so you don’t need to make a styled element for each tag that requires styling. That’s what we do here for the name and quantity of the ingredient item. Of course, we could split it further and create a new component for an ingredient. How fine-grained to go is up to you and depends on the project’s complexity. In this case, the ingredient is just a styled component defined along with the rest in the RecipeStyles file:

export const Recipe = styled.div`
  background-color: ${theme('colors.background-highlight')};
`;

export const Title = styled.div`
  font-weight: bold;
`

export const Ingredients = styled.ul`
  margin: 5px 0;
`

export const Ingredient = styled.li`
  & .name {
    ...
  }

  & .quantity {
    ...
  }
`

For the purpose of this exercise, I have used the ThemeProvider. It injects the theme into the props of styled components. You can simply use it as color: ${props => props.theme.core_color}; I am just using a small wrapper to protect against missing attributes in the theme:

const theme = (key) => (prop) => _.get(prop.theme, key) || console.warn('missing key', key)

You can also define your own constants in a module and use those instead. For example: color: ${styleConstants.core_color}

Pros

A perk of using styled-components is that you can use it as little as you want. You can use your favorite UI framework and add styled-components on top of it. This also means that you can easily migrate an existing project component by component. You can choose to style most of the layout with standard CSS and only use styled-components for reusable components.

Cons

Designers/style integrators will need to learn very basic JavaScript to handle variables and use them in place of Sass/Less.

They will also have to learn to navigate the project structure, although I would argue that finding the styles for a component in that component’s folder is easier than having to find the right CSS/Sass/Less file that contains the rule you need to modify.

They will also need to change their tools a bit if they want syntax highlighting, linting, etc. A good place to start is with this Atom plugin and this babel plugin.

 

This post was written by Marcus McCurdy, Software Engineer for Toptal.

Discussions criticizing Python often talk about how it is difficult to use Python for multithreaded work, pointing fingers at what is known as the global interpreter lock (affectionately referred to as the GIL) that prevents multiple threads of Python code from running simultaneously. Due to this, the Python multithreading module doesn’t quite behave the way you would expect it to if you’re not a Python developer and you are coming from other languages such as C++ or Java. It must be made clear that one can still write code in Python that runs concurrently or in parallel and make a stark difference in resulting performance, as long as certain things are taken into consideration. If you haven’t read it yet, I suggest you take a look at Eqbal Quran’s article on concurrency and parallelism in Ruby here on the Toptal Engineering Blog.

In this Python concurrency tutorial, we will write a small Python script to download the most popular images from Imgur. We will start with a version that downloads images sequentially, or one at a time. As a prerequisite, you will have to register an application on Imgur. If you do not have an Imgur account already, please create one first.

The scripts in this tutorial have been tested with Python 3.6.4. With some changes, they should also run with Python 2—urllib is what has changed the most between these two versions of Python.

Getting Started with Python Multithreading

Let us start by creating a Python module, named download.py. This file will contain all the functions necessary to fetch the list of images and download them. We will split these functionalities into three separate functions:

  • get_links
  • download_link
  • setup_download_dir

The third function, setup_download_dir, will be used to create a download destination directory if it doesn’t already exist.

Imgur’s API requires HTTP requests to bear the Authorization header with the client ID. You can find this client ID on the dashboard of the application that you have registered on Imgur. The response from the API will be JSON encoded, and we can use Python’s standard JSON library to decode it. Downloading the image is an even simpler task, as all you have to do is fetch the image by its URL and write it to a file.

This is what the script looks like:

import json
import logging
import os
from pathlib import Path
from urllib.request import urlopen, Request

logger = logging.getLogger(__name__)

def get_links(client_id):
    # Imgur expects the client ID in the Authorization header
    headers = {'Authorization': 'Client-ID {}'.format(client_id)}
    req = Request('https://api.imgur.com/3/gallery/', headers=headers, method='GET')
    with urlopen(req) as resp:
        data = json.loads(resp.read().decode('utf-8'))
    return map(lambda item: item['link'], data['data'])

def download_link(directory, link):
    logger.info('Downloading %s', link)
    download_path = directory / os.path.basename(link)
    with urlopen(link) as image, download_path.open('wb') as f:
        f.write(image.read())

def setup_download_dir():
    download_dir = Path('images')
    if not download_dir.exists():
        download_dir.mkdir()
    return download_dir

Next, we will need to write a module that will use these functions to download the images, one by one. We will name this single.py. This will contain the main function of our first, naive version of the Imgur image downloader. The module will retrieve the Imgur client ID in the environment variable IMGUR_CLIENT_ID. It will invoke the setup_download_dir to create the download destination directory. Finally, it will fetch a list of images using the get_links function, filter out all GIF and album URLs, and then use download_link to download and save each of those images to the disk. Here is what single.py looks like:

import logging
import os
from time import time

from download import setup_download_dir, get_links, download_link

logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logging.getLogger('requests').setLevel(logging.CRITICAL)
logger = logging.getLogger(__name__)

def main():
    ts = time()
    client_id = os.getenv('IMGUR_CLIENT_ID')
    if not client_id:
        raise Exception("Couldn't find IMGUR_CLIENT_ID environment variable!")
    download_dir = setup_download_dir()
    links = [l for l in get_links(client_id) if l.endswith('.jpg')]
    for link in links:
        download_link(download_dir, link)
    print('Took {}s'.format(time() - ts))

if __name__ == '__main__':
    main()

On my laptop, this script took 19.4 seconds to download 91 images. Please do note that these numbers may vary based on the network you are on. 19.4 seconds isn’t terribly long, but what if we wanted to download more pictures? Perhaps 900 images, instead of 90. With an average of 0.2 seconds per picture, 900 images would take approximately 3 minutes. For 9000 pictures it would take 30 minutes. The good news is that by introducing concurrency or parallelism, we can speed this up dramatically.

All subsequent code examples will only show import statements that are new and specific to those examples. For convenience, all of these Python scripts can be found in this GitHub repository.

Using Threads for Concurrency and Parallelism

Threading is one of the most well-known approaches to attaining Python concurrency and parallelism. Threading is a feature usually provided by the operating system. Threads are lighter than processes, and share the same memory space.

Python multithreading memory model

In our Python threading tutorial, we will write a new module to replace single.py. This module will create a pool of eight threads, making a total of nine threads including the main thread. I chose eight worker threads because my computer has eight CPU cores, and one worker thread per core seemed a reasonable number for how many threads to run at once. In practice, this number is chosen much more carefully based on other factors, such as other applications and services running on the same machine.

This is almost the same as the previous one, with the exception that we now have a new class, DownloadWorker, which is a descendant of the Python Thread class. The run method has been overridden to run an infinite loop. On every iteration, it calls self.queue.get() to try to fetch a URL from a thread-safe queue. It blocks until there is an item in the queue for the worker to process. Once the worker receives an item from the queue, it then calls the same download_link method that was used in the previous script to download the image to the images directory. After the download is finished, the worker signals the queue that the task is done. This is very important, because the Queue keeps track of how many tasks were enqueued. The call to queue.join() would block the main thread forever if the workers did not signal that they completed a task.

from queue import Queue
from threading import Thread

class DownloadWorker(Thread):
    def __init__(self, queue):
        Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            # Get the work from the queue and expand the tuple
            directory, link = self.queue.get()
            download_link(directory, link)
            self.queue.task_done()

def main():
    ts = time()
    client_id = os.getenv('IMGUR_CLIENT_ID')
    if not client_id:
        raise Exception("Couldn't find IMGUR_CLIENT_ID environment variable!")
    download_dir = setup_download_dir()
    links = [l for l in get_links(client_id) if l.endswith('.jpg')]
    # Create a queue to communicate with the worker threads
    queue = Queue()
    # Create 8 worker threads
    for x in range(8):
        worker = DownloadWorker(queue)
        # Setting daemon to True will let the main thread exit even though the workers are blocking
        worker.daemon = True
        worker.start()
    # Put the tasks into the queue as a tuple
    for link in links:
        logger.info('Queueing {}'.format(link))
        queue.put((download_dir, link))
    # Causes the main thread to wait for the queue to finish processing all the tasks
    queue.join()
    print('Took {}'.format(time() - ts))

Running this Python threading example script on the same machine used earlier results in a download time of 4.1 seconds! That’s 4.7 times faster than the previous example. While this is much faster, it is worth mentioning that only one thread was executing at a time throughout this process due to the GIL. Therefore, this code is concurrent but not parallel. The reason it is still faster is because this is an IO bound task. The processor is hardly breaking a sweat while downloading these images, and the majority of the time is spent waiting for the network. This is why Python multithreading can provide a large speed increase. The processor can switch between the threads whenever one of them is ready to do some work. Using the threading module in Python or any other interpreted language with a GIL can actually result in reduced performance. If your code is performing a CPU bound task, such as decompressing gzip files, using the threading module will result in a slower execution time. For CPU bound tasks and truly parallel execution, we can use the multiprocessing module.

While the de facto reference Python implementation, CPython, has a GIL, this is not true of all Python implementations. For example, IronPython, a Python implementation using the .NET framework, does not have a GIL, and neither does Jython, the Java-based implementation. You can find a list of working Python implementations here.

Related: Python Best Practices and Tips by Toptal Developers

Spawning Multiple Processes

The multiprocessing module is easier to drop in than the threading module, as we don’t need to add a class as we did in the Python threading example. The only changes we need to make are in the main function.

Python multiprocessing tutorial: Modules

To use multiple processes, we create a multiprocessing Pool. With the map method it provides, we will pass the list of URLs to the pool, which in turn will spawn eight new processes and use each one to download the images in parallel. This is true parallelism, but it comes with a cost. The entire memory of the script is copied into each subprocess that is spawned. In this simple example, it isn’t a big deal, but it can easily become serious overhead for non-trivial programs.

from functools import partial
from multiprocessing.pool import Pool

def main():
    ts = time()
    client_id = os.getenv('IMGUR_CLIENT_ID')
    if not client_id:
        raise Exception("Couldn't find IMGUR_CLIENT_ID environment variable!")
    download_dir = setup_download_dir()
    links = [l for l in get_links(client_id) if l.endswith('.jpg')]
    download = partial(download_link, download_dir)
    with Pool(8) as p:
        p.map(download, links)
    print('Took {}s'.format(time() - ts))

Distributing to Multiple Workers

While the threading and multiprocessing modules are great for scripts that are running on your personal computer, what should you do if you want the work to be done on a different machine, or you need to scale up to more than the CPU on one machine can handle? A great use case for this is long-running back-end tasks for web applications. If you have some long-running tasks, you don’t want to spin up a bunch of sub-processes or threads on the same machine that need to be running the rest of your application code. This will degrade the performance of your application for all of your users. What would be great is to be able to run these jobs on another machine, or many other machines.

A great Python library for this task is RQ, a very simple yet powerful library. You first enqueue a function and its arguments using the library. This pickles the function call representation, which is then appended to a Redis list. Enqueueing the job is the first step, but will not do anything yet. We also need at least one worker to listen on that job queue.

Model of the RQ Python queue library

The first step is to install and run a Redis server on your computer, or have access to a running Redis server. After that, there are only a few small changes made to the existing code. We first create an instance of an RQ Queue and pass it an instance of a Redis server from the redis-py library. Then, instead of just calling our download_link method, we call q.enqueue(download_link, download_dir, link). The enqueue method takes a function as its first argument, then any other arguments or keyword arguments are passed along to that function when the job is actually executed.

One last step we need to do is to start up some workers. RQ provides a handy script to run workers on the default queue. Just run rqworker in a terminal window and it will start a worker listening on the default queue. Please make sure your current working directory is the same as where the scripts reside in. If you want to listen to a different queue, you can run rqworker queue_name and it will listen to that named queue. The great thing about RQ is that as long as you can connect to Redis, you can run as many workers as you like on as many different machines as you like; therefore, it is very easy to scale up as your application grows. Here is the source for the RQ version:

from redis import Redis
from rq import Queue

def main():
    client_id = os.getenv('IMGUR_CLIENT_ID')
    if not client_id:
        raise Exception("Couldn't find IMGUR_CLIENT_ID environment variable!")
    download_dir = setup_download_dir()
    links = [l for l in get_links(client_id) if l.endswith('.jpg')]
    q = Queue(connection=Redis(host='localhost', port=6379))
    for link in links:
        q.enqueue(download_link, download_dir, link)

However, RQ is not the only Python job queue solution. RQ is easy to use and covers simple use cases extremely well, but if more advanced options are required, other Python 3 queue solutions (such as Celery) can be used.
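
For comparison, here is a minimal sketch of the same job written as a Celery task; the broker URL and module layout are assumptions and are not part of the example repository:

from pathlib import Path

from celery import Celery

from download import download_link

app = Celery('downloads', broker='redis://localhost:6379/0')

@app.task
def download(directory, link):
    # Celery serializes task arguments, so the directory travels as a string
    download_link(Path(directory), link)

# In main(), instead of q.enqueue(...), you would call:
#     for link in links:
#         download.delay(str(download_dir), link)

A worker would then be started with celery -A <module name> worker, assuming the snippet above lives in a module the worker can import.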

Python Multithreading vs. Multiprocessing

If your code is IO bound, both multiprocessing and multithreading in Python will work for you. Multiprocessing is easier to drop in than threading but has a higher memory overhead. If your code is CPU bound, multiprocessing is most likely going to be the better choice—especially if the target machine has multiple cores or CPUs. For web applications, and when you need to scale the work across multiple machines, RQ is going to be better for you.

Update

Python concurrent.futures

Something new since Python 3.2 that wasn’t touched upon in the original article is the concurrent.futures package. This package provides yet another way to use concurrency and parallelism with Python.

In the original article, I mentioned that Python’s multiprocessing module would be easier to drop into existing code than the threading module. This was because the Python 3 threading module required subclassing the Thread class and also creating a Queue for the threads to monitor for work.

Using a concurrent.futures.ThreadPoolExecutor makes the code almost identical to the multiprocessing module.

import logging
import os
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from time import time

from download import setup_download_dir, get_links, download_link

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

logger = logging.getLogger(__name__)


def main():
    client_id = os.getenv('IMGUR_CLIENT_ID')
    if not client_id:
        raise Exception("Couldn't find IMGUR_CLIENT_ID environment variable!")
    download_dir = setup_download_dir()
    links = get_links(client_id)

    # By placing the executor inside a with block, the executor's shutdown
    # method will be called, cleaning up the threads.
    #
    # By default, the executor sets the number of workers to 5 times the
    # number of CPUs.
    with ThreadPoolExecutor() as executor:

        # Create a new partially applied function that stores the directory
        # argument.
        # 
        # This allows the download_link function that normally takes two
        # arguments to work with the map function that expects a function of a
        # single argument.
        fn = partial(download_link, download_dir)

        # Executes fn concurrently using threads on the links iterable. The
        # timeout is for the entire process, not a single call, so downloading
        # all images must complete within 30 seconds.
        executor.map(fn, links, timeout=30)


if __name__ == '__main__':
    main()

Now that we have all these images downloaded with our Python ThreadPoolExecutor, we can use them to test a CPU-bound task. We can first create thumbnail versions of all the images with a single-threaded, single-process script and then test a multiprocessing-based solution.

We are going to use the Pillow library to handle the resizing of the images.

Here is our initial script.

import logging
from pathlib import Path
from time import time

from PIL import Image

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

logger = logging.getLogger(__name__)


def create_thumbnail(size, path):
    """
    Creates a thumbnail of an image with the same name as image but with
    _thumbnail appended before the extension.  E.g.:

    >>> create_thumbnail((128, 128), 'image.jpg')

    A new thumbnail image is created with the name image_thumbnail.jpg

    :param size: A tuple of the width and height of the image
    :param path: The path to the image file
    :return: None
    """
    image = Image.open(path)
    image.thumbnail(size)
    path = Path(path)
    name = path.stem + '_thumbnail' + path.suffix
    thumbnail_path = path.with_name(name)
    image.save(thumbnail_path)


def main():
    ts = time()
    for image_path in Path('images').iterdir():
        create_thumbnail((128, 128), image_path)
    logging.info('Took %s', time() - ts)


if __name__ == '__main__':
    main()

This script iterates over the paths in the images folder and for each path it runs the create_thumbnail function. This function uses Pillow to open the image, create a thumbnail, and save the new, smaller image with the same name as the original but with _thumbnail appended to the name.

Running this script on 160 images totaling 36 million takes 2.32 seconds. Let’s see if we can speed this up using a ProcessPoolExecutor.

import logging
from pathlib import Path
from time import time
from functools import partial

from concurrent.futures import ProcessPoolExecutor

from PIL import Image

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

logger = logging.getLogger(__name__)


def create_thumbnail(size, path):
    """
    Creates a thumbnail of an image with the same name as image but with
    _thumbnail appended before the extension. E.g.:

    >>> create_thumbnail((128, 128), 'image.jpg')

    A new thumbnail image is created with the name image_thumbnail.jpg

    :param size: A tuple of the width and height of the image
    :param path: The path to the image file
    :return: None
    """
    path = Path(path)
    name = path.stem + '_thumbnail' + path.suffix
    thumbnail_path = path.with_name(name)
    image = Image.open(path)
    image.thumbnail(size)
    image.save(thumbnail_path)


def main():
    ts = time()
    # Partially apply the create_thumbnail method, setting the size to 128x128
    # and returning a function of a single argument.
    thumbnail_128 = partial(create_thumbnail, (128, 128))

    # Create the executor in a with block so shutdown is called when the block
    # is exited.
    with ProcessPoolExecutor() as executor:
        executor.map(thumbnail_128, Path('images').iterdir())
    logging.info('Took %s', time() - ts)


if __name__ == '__main__':
    main()

The create_thumbnail method is identical to the last script. The main difference is the creation of a ProcessPoolExecutor. The executor’s map method is used to create the thumbnails in parallel. By default, the ProcessPoolExecutor creates one subprocess per CPU. Running this script on the same 160 images took 1.05 seconds—2.2 times faster!

Async/Await (Python 3.5+ only)

One of the most requested items in the comments on the original article was for an example using Python 3’s asyncio module. Compared to the other examples, it uses some Python syntax that may be new to most people, as well as some new concepts. An unfortunate additional layer of complexity is caused by Python’s built-in urllib module not being asynchronous. We will need to use an async HTTP library to get the full benefits of asyncio. For this, we’ll use aiohttp.

Let’s jump right into the code and a more detailed explanation will follow.

import asyncio
import logging
import os
from time import time

import aiohttp

from download import setup_download_dir, get_links

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)


async def async_download_link(session, directory, link):
    """
    Async version of the download_link method we've been using in the other examples.
    :param session: aiohttp ClientSession
    :param directory: directory to save downloads
    :param link: the url of the link to download
    :return:
    """
    download_path = directory / os.path.basename(link)
    async with session.get(link) as response:
        with download_path.open('wb') as f:
            while True:
                # await pauses execution until the 1024 (or less) bytes are read from the stream
                chunk = await response.content.read(1024)
                if not chunk:
                    # We are done reading the file, break out of the while loop
                    break
                f.write(chunk)
    logger.info('Downloaded %s', link)


# Main is now a coroutine
async def main():
    client_id = os.getenv('IMGUR_CLIENT_ID')
    if not client_id:
        raise Exception("Couldn't find IMGUR_CLIENT_ID environment variable!")
    download_dir = setup_download_dir()
    # We use a session to take advantage of tcp keep-alive
    # Set a 3 second read and connect timeout. Default is 5 minutes
    async with aiohttp.ClientSession(conn_timeout=3, read_timeout=3) as session:
        tasks = [(async_download_link(session, download_dir, l)) for l in get_links(client_id)]
        # gather aggregates all the tasks and schedules them in the event loop
        await asyncio.gather(*tasks, return_exceptions=True)


if __name__ == '__main__':
    ts = time()
    # Create the asyncio event loop
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main())
    finally:
        # Shutdown the loop even if there is an exception
        loop.close()
    logger.info('Took %s seconds to complete', time() - ts)

There is quite a bit to unpack here. Let’s start with the main entry point of the program. The first new thing we do with the asyncio module is to obtain the event loop. The event loop handles all of the asynchronous code. Then, main() is passed to run_until_complete, which runs the loop until the coroutine finishes. There is a piece of new syntax in the definition of main: async def. You’ll also notice await and async with.

The async/await syntax was introduced in PEP 492. The async def syntax marks a function as a coroutine. Internally, coroutines are based on Python generators, but aren’t exactly the same thing. Coroutines return a coroutine object similar to how generators return a generator object. Once you have a coroutine, you obtain its results with the await expression. When a coroutine calls await, execution of the coroutine is suspended until the awaitable completes. This suspension allows other work to be completed while the coroutine is suspended “awaiting” some result. In general, this result will be some kind of I/O like a database request or in our case an HTTP request.
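
To make the relationship concrete, here is a tiny, self-contained illustration of a coroutine object being created and awaited; it is not part of the downloader:

import asyncio

async def add(a, b):
    await asyncio.sleep(0.1)  # stand-in for some I/O wait
    return a + b

async def demo():
    coro = add(1, 2)     # calling the coroutine returns a coroutine object
    result = await coro  # awaiting suspends demo() until add() completes
    print(result)        # prints 3

asyncio.get_event_loop().run_until_complete(demo())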

The download_link function had to be changed pretty significantly. Previously, we were relying on urllib to do the brunt of the work of reading the image for us. Now, to allow our method to work properly with the async programming paradigm, we’ve introduced a while loop that reads chunks of the image at a time and suspends execution while waiting for the I/O to complete. This allows the event loop to loop through downloading the different images as each one has new data available during the download.

There Should Be One—Preferably Only One—Obvious Way to Do It

While the Zen of Python tells us there should be one obvious way to do something, there are many ways in Python to introduce concurrency into our programs. The best method to choose is going to depend on your specific use case. The asynchronous paradigm scales better to high-concurrency workloads (like a webserver) compared to threading or multiprocessing, but it requires your code (and dependencies) to be async in order to fully benefit.

Hopefully this article—and update—will point you in the right direction so you have an idea of where to look in the Python standard library if you need to introduce concurrency into your programs.

About the Author: Marcus has a Bachelor’s in Computer Engineering and a Master’s in Computer Science. He is a talented programmer, and excels most at back-end development. However, he is comfortable creating polished products as a full stack developer.

This post was written by Filip Petkovski, Software Engineer for Toptal.

Mobile applications are creeping in everywhere, from smartphones and tablets to smartwatches, and soon they will be found in other wearables, too. However, developing for each separate mobile platform can be an exhaustive task, especially if your resources are limited, or if you are a single developer. This is where becoming a well-versed Apache Cordova developer can come in handy by providing a way to develop mobile applications using standard web technologies—HTML5, CSS3, and JavaScript.

In 2009, a startup called Nitobi created PhoneGap, an open source API for accessing native mobile resources, with the goal of enabling developers to create mobile applications using standard web technologies. In Nitobi’s vision, most mobile applications would soon be developed using PhoneGap, but developers would still have the option of writing native code when necessary, be it due to performance issues, or lack of a method of accessing specific hardware.

Cordova PhoneGap?

There’s no such thing, really. What happened was, Adobe acquired Nitobi in 2011 and donated the open-source core to the Apache Software Foundation, which rebranded it as Apache Cordova. A common analogy you will often run into is that Cordova is to PhoneGap what WebKit is to Chrome or Safari.

Obviously, the differences between Cordova and PhoneGap were minimal in the beginning. With time, Adobe PhoneGap developed its own set of proprietary features, while Cordova was—and still is—supported by the open-source community. This Apache Cordova review and tutorial will examine Cordova app development in more detail, and while some of it may apply to PhoneGap, this shouldn’t be considered a PhoneGap tutorial, per se.

Apache Cordova Capabilities

In essence, Cordova has no limitations in relation to natively developed applications. What you get with Cordova is simply a JavaScript API, which serves as a wrapper for native code and is consistent across devices. You can consider Cordova to be an application container with a web view, which covers the entire screen of the device. The web view used by Cordova is the same web view used by the native operating system. On iOS, this is the default Objective-C UIWebView or a custom WKWebView class; on Android, this is android.webkit.WebView.

Apache Cordova comes with a set of pre-developed plugins which provide access to the device’s camera, GPS, file system, etc. As mobile devices evolve, adding support for additional hardware is simply a matter of developing new plugins.

Finally, Cordova applications install just like native applications. This means that building your code for iOS will produce an IPA file, for Android an APK file, and building for Windows Phone produces an XAP file. If you put enough effort into the development process, your users might not even realize that they are not using a native application.


Apache Cordova Development Workflows

There are two basic paths you can follow when developing with Cordova:

  • When your intention is to deploy an application to as many platforms as possible, with little or no platform-specific development, you should use the cross-platform workflow. The main tool supporting this workflow is the Cordova Command-Line Interface (CLI), which serves as a higher level abstraction for configuring and building your application for different platforms. This is the more commonly used development path.
  • If you plan to develop your application with a specific platform in mind, you should use the platform-centered workflow. This way, you will be able to tweak and modify your code at a lower level by mixing native components with Cordova components. Even though you could use this approach for cross-platform development, the process will be longer and more tedious.

It is usually recommended to start with the cross-platform development workflow, since switching to platform-centered development is fairly straightforward. However, if you initially start with the platform-centered workflow, you will not be able to switch to cross-platform development since the CLI will overwrite your customizations once you run the build process.

Prerequisites and Cordova Installation

Before installing and running anything related to Cordova, you will need to install the SDK for each platform that you intend to build your application for. We will focus on the Android platform in this article; however, the process involving other platforms is similar.

You should download the Android SDK found here. For Windows, the SDK comes as an installer, while for Linux and OSX it comes as an archive which can simply be extracted. After extracting/installing the package, you will need to add the sdk/tools and sdk/platform-tools directories to your PATH variable. The PATH variable is used by Cordova to look for the binaries it needs for the build process. If you don’t have Java installed, you should go ahead and install the JDK together with Ant. ANT_HOME and JAVA_HOME should be set to the bin folders of JDK and Ant, and after installing the Android SDK, set the ANDROID_HOME variable to Android/Sdk. All locations in the three *_HOME variables should also be in your PATH variable.
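
On Linux or macOS this boils down to a few lines in your shell profile. The install paths below are assumptions, so adjust them to wherever the JDK, Ant, and the SDK actually live on your machine:

# Example ~/.bashrc entries; the paths are illustrative
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk/bin   # JDK bin folder
export ANT_HOME=/usr/share/ant/bin                 # Ant bin folder
export ANDROID_HOME=$HOME/Android/Sdk              # Android SDK root
export PATH=$PATH:$JAVA_HOME:$ANT_HOME:$ANDROID_HOME/tools:$ANDROID_HOME/platform-tools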

After you have installed the SDK, the android command will become available in your command line. Execute it to open the SDK manager and install the latest tools and Android API. You will likely need Android SDK Tools, Android SDK Platform-tools, Android SDK Build-tools, SDK Platform, Google APIs Intel x86 Atom System Image, Sources for Android SDK, and Intel x86 Emulator Accelerator (HAXM installer). After that, you will be able to create an emulator with android avd.

Cordova CLI depends on Node.js and the Git client, so go ahead and download and install Node from nodejs.org, and Git from git-scm.com. You will be using npm to install Cordova CLI itself as well as for installing additional plugins, and Cordova will use git behind the scenes in order to download required dependencies. Finally, run

npm install -g cordova

…to install the Cordova CLI globally (npm install cordova isn’t sufficient by itself.)

To summarize, these are the packages that you will need:

  • Java
  • Ant
  • Android SDK
  • NodeJS
  • Git

And these environment variables will need to be updated:

  • PATH
  • JAVA_HOME
  • ANT_HOME
  • ANDROID_HOME

Bootstrapping an Application

Provided you have successfully installed Cordova, you should now have access to the Cordova command line utility. Open your terminal or command-line, and navigate to a directory where you would like to create your first Cordova project. To bootstrap an application, type in the following command:

cordova create toptal toptal.hello HelloToptal 

The command line consists of the name of the command, cordova, followed by the subcommand create. The subcommand is invoked with three additional parameters: the folder where the application will be placed, the namespace of the application, and its display name. This bootstraps the application in a folder with the following structure:

toptal/
|-- hooks/
|-- platforms/
|-- plugins/
|-- www/
`-- config.xml

The www folder contains your application core. This is where you will place your application code which is common for all platforms.

While Cordova allows you to easily develop an app for different platforms, sometimes you need to add customizations. When developing for multiple platforms, you don’t want to modify the source files under the various platforms/[platform-name]/ directories (for example, platforms/android/assets/www), because they’re regularly overwritten with the top-level www files.

At this point you can also open up the config.xml file and change the metadata for your application, such as author and description.
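
For example, the relevant part of config.xml might end up looking like this (the values are illustrative):

<widget id="toptal.hello" version="1.0.0"
        xmlns="http://www.w3.org/ns/widgets"
        xmlns:cdv="http://cordova.apache.org/ns/1.0">
    <name>HelloToptal</name>
    <description>A sample Apache Cordova application</description>
    <author email="you@example.com" href="https://example.com">Your Name</author>
    <content src="index.html" />
    <access origin="*" />
</widget>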

Add your first platform using:

cordova platform add android

If you change your mind later on, you can remove a platform from the build process easily:

cordova platform rm android

Upon inspecting the platforms directory, you will notice the android folder within it. For each platform that you add, Cordova will create a new directory in platforms and duplicate the www folder within it. If, for example, you want to customize your application for Android, you can modify the files in platforms/android/assets/www and switch to platform-specific shell tools.

However, remember that if you rebuild your application with the CLI (used for cross-platform development), Cordova will overwrite the changes you have made for each platform, so either make sure you have them under version control, or you do platform-specific changes after you have finished with cross-platform development. As we mentioned earlier, moving from cross-platform to platform-specific development is easy. Moving in the other direction is not.

If you want to keep using the cross-platform workflow and still make platform-specific customizations, you should use the top-level merges folder. From Cordova version 3.5 onward, this folder has been removed from the default application template, but if you need it, you can simply create it alongside the other top-level directories (hooks, platforms, plugins, and www).

Platform-specific customizations are placed in merges/[platform-name], and are applied after the source files in the top-level www folder. This way, you can either add new source files for certain platforms, or you can override entire top-level source files with platform-specific ones. Take the following structure for example:

merges/
|-- wp8/
|   `-- app.js
|-- android/
|   `-- android.js
www/
`-- app.js

In this case, the output file for Android will contain both the app.js and android.js files, but the output file for Windows Phone 8 will only contain the app.js file which is found in the merges/wp8 folder, since the files in merges/[platform] override the files in www.

The plugins directory contains information for each platform’s plugins. At this point, you should only have the android.json file which should have the following structure:

{
    "prepare_queue": {
        "installed": [],
        "uninstalled": []
    },
    "config_munge": {
        "files": {}
    },
    "installed_plugins": {},
    "dependent_plugins": {}
}

Let us build the application and deploy it to an Android device. You can use the emulator too, if you want.

Cordova provides several CLI commands for building and running your apps: cordova prepare, cordova compile, cordova build (which is a shortcut for the previous two), cordova emulate, and cordova run (which incorporates build and can run the emulator too). This should not confuse you, because in most cases you will want to build and run your app in the emulator:

cordova run --emulator

If you want, you can plug your device in through the USB port, enable USB debugging mode, and deploy your first Apache Cordova application straight to your device by simply running:

cordova run 

This will copy all of your files into platforms/* and execute all required tasks.

You can limit the scope of the build process by specifying the name of the platform for which you want to build the application, and even the specific emulator, e.g.:

cordova run android --emulator

or

cordova run ios --emulator --target="iPhone-8-Plus"

Hands-on Apache Cordova Tutorial

Let’s create a simple tutorial application to demonstrate the use of Cordova and its plugins. The entire demo can be found in this GitHub repository so that you can download it and go through parts of it along with this short Cordova tutorial.

We will use the initial setup you created and add additional code. Let’s say that we want to add new projects to an imaginary Toptal database, as well as view existing ones. Open up index.html and set up two tabs in the following manner:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8" />
        <meta name="format-detection" content="telephone=no" />
        <meta name="msapplication-tap-highlight" content="no" />
        <meta name="viewport" content="user-scalable=no, initial-scale=1, maximum-scale=1, minimum-scale=1, width=device-width, height=device-height, target-densitydpi=device-dpi" />
        <link rel="stylesheet" type="text/css" href="css/bootstrap.min.css" />
        <link rel="stylesheet" href="css/jquery.mobile-1.4.5.min.css" />
        <link rel="stylesheet" type="text/css" href="css/toptal.css" />
        <title>Hello Toptal</title>
    </head>
    <body>
        <div id="container">
            <div id="tab-content">
                    
            </div>
        </div>
        <footer>
            <ul id="menu">
                <li id="search-tab-button" class="tab-button active" data-tab="#search-tab">Search Projects</li>
                <li id="post-tab-button" class="tab-button" data-tab="#add-tab">Post a Project</li>
            </ul>
        </footer>
        <div id="dev-null" style="display: none"></div>
        <script src="js/lib/jquery-1.11.1.min.js"></script>
        <script src="js/lib/jquery.mobile-1.4.5.min.js"></script>
        <script type="text/javascript" src="cordova.js"></script>
        <script type="text/javascript" src="js/SQLiteStorageService.js"></script>
        <script type="text/javascript" src="js/Controller.js"></script>
        <script type="text/javascript" src="js/index.js"></script>
    </body>
</html>

Notice that I have added Bootstrap and jQuery Mobile as dependencies. Please be aware that much better solutions and frameworks have been developed for building modern hybrid applications, but since most (if not all) web developers are familiar with these two libraries, it makes sense to use them for a beginners’ tutorial. You can download the stylesheets from GitHub or use your own, if you prefer.

Let us move to the index.js file, and strip it down to the following:

var app = {
    // Application Constructor
    initialize: function() {
        if (navigator.userAgent.match(/(iPhone|iPod|iPad|Android|BlackBerry)/)) {
            document.addEventListener("deviceready", this.onDeviceReady, false);
        } else {
            this.onDeviceReady();
        }
    },

    onDeviceReady: function() {
        // We will init / bootstrap our application here
    },
};
app.initialize();

Remember that the advocated architecture for Cordova applications is setting up a Single Page Application (SPA). This way, all of the resources are only loaded once when the app starts, and can stay in the web view for as long as the application is running. In addition, with SPAs, the user will not experience page reloads, which are simply not typical of native applications. Keeping that in mind, let us set up a simple controller to switch between the two tabs:

var Controller = function() {
    var controller = {
        self: null,
        initialize: function() {
            self = this;
            this.bindEvents();
            self.renderSearchView(); 
        },

        bindEvents: function() {
            $('.tab-button').on('click', this.onTabClick);
        },

        onTabClick: function(e) {
            e.preventDefault();
            if ($(this).hasClass('active')) {
                return;
            }
            
            var tab = $(this).data('tab');
            if (tab === '#add-tab') {
                self.renderPostView();
            } else {
                self.renderSearchView();
            }
        },

        renderPostView: function() {
            $('.tab-button').removeClass('active');
            $('#post-tab-button').addClass('active');

            var $tab = $('#tab-content');
            $tab.empty();
            $("#tab-content").load("./views/post-project-view.html", function(data) {
                $('#tab-content').find('#post-project-form').on('submit', self.postProject);
            }); 
        },
       
        renderSearchView: function() {
            $('.tab-button').removeClass('active');
            $('#search-tab-button').addClass('active');

            var $tab = $('#tab-content');
            $tab.empty();

            var $projectTemplate = null;
            $("#tab-content").load("./views/search-project-view.html", function(data) {
                $projectTemplate = $('.project').remove();
                // Load projects here
            }); 
        }
    }
    controller.initialize();
    return controller;
}

The controller has two methods so far, one for rendering the Search View, and one for rendering the Post Project view. Let’s initialize it in our index.js file by first declaring it at the top and constructing it in the onDeviceReady method:

// top of index.js
var controller = null
// inside onDeviceReady method
controller = new Controller();

Finally, add a script reference in index.html above the reference to index.js. You can download the Search and Post views directly from GitHub. Since the partial views are read from a file, some browsers like Chrome, while trying to render your page, will complain about cross-domain requests.

A possible solution here would be to run a local static server, for example using the node-static npm module. This is also the point where you can start thinking about using a framework such as PhoneGap and/or Ionic. They provide a range of development tools, including in-browser emulation, hot reloading, and code generation (scaffolding).

For now, let’s simply deploy to an Android device by running the following:

cordova run android

At this point, your application should have two tabs. The first tab allows projects to be searched:

Apache Cordova application

The second tab allows new projects to be posted:

Apache Cordova project posted

All we have now is a classic web application running inside a web view. We haven’t really used any of the native features so let’s try to do that now. A common question is how to store data locally on the device, or more precisely, what type of storage to use. There are several ways to go:

  • LocalStorage
  • WebSQL
  • IndexedDB
  • Server-side storage accessed through a web service
  • Third-party plugins providing other options

LocalStorage is OK for storing small amounts of data, but it won’t suffice if you are building a data-intensive application, as the available space varies from 3 to 10 MB. IndexedDB may be a better solution for this case. WebSQL is deprecated and not supported on some platforms. Finally, using web services to fetch and modify data fits well within the SPA paradigm, but it breaks down when your application goes offline. PWA techniques, along with Service Workers, have recently come into the Cordova world to help with this.
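
For instance, LocalStorage is just a synchronous key-value store; this snippet is only an illustration and is not used in the demo app:

// Persist small bits of UI state between launches
localStorage.setItem('lastTab', '#search-tab');
var lastTab = localStorage.getItem('lastTab'); // '#search-tab', or null if never set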

Also, there are a lot of additional, third-party plugins that come in to fill the gaps in Cordova’s core. The File plugin may be quite useful as it provides you with access to the device’s file system, allowing you to create and store files. For now, let’s try SQLitePlugin which provides you with a local SQLite database. You can add it to your project by running:

cordova plugin add https://github.com/brodysoft/Cordova-SQLitePlugin

SQLitePlugin provides an API to the device’s SQLite database and serves as a true persistence mechanism. We can create a simple Storage Service in the following manner:

SQLiteStorageService = function () {
    var service = {};
    var db = window.sqlitePlugin ?
        window.sqlitePlugin.openDatabase({name: "demo.toptal", location: "default"}) :
        window.openDatabase("demo.toptal", "1.0", "DB for FactAV", 5000000);

    service.initialize = function() {
        // Initialize the database 
        var deferred = $.Deferred();
        db.transaction(function(tx) {
            tx.executeSql(
                'CREATE TABLE IF NOT EXISTS projects ' + 
                '(id integer primary key, name text, company text, description text, latitude real, longitude real)'
            ,[], function(tx, res) {
                tx.executeSql('DELETE FROM projects', [], function(tx, res) {
                    deferred.resolve(service);
                }, function(tx, res) {
                    deferred.reject('Error initializing database');
                });
            }, function(tx, res) {
                deferred.reject('Error initializing database');
            });
        });
        return deferred.promise();
    }

    service.getProjects = function() {
        // fetch projects
    }

    service.addProject = function(name, company, description, addLocation) {
        // add a new project
    }

    return service.initialize();
}
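
For reference, here is a rough sketch of what getProjects could look like with the schema above; the complete implementation is on GitHub, so treat this only as an illustration:

service.getProjects = function() {
    var deferred = $.Deferred();
    db.transaction(function(tx) {
        tx.executeSql('SELECT * FROM projects', [], function(tx, res) {
            var projects = [];
            for (var i = 0; i < res.rows.length; i++) {
                var row = res.rows.item(i);
                projects.push({
                    name: row.name,
                    company: row.company,
                    description: row.description,
                    // expose coordinates the way the search view expects them
                    location: row.latitude != null ?
                        {latitude: row.latitude, longitude: row.longitude} : null
                });
            }
            deferred.resolve(projects);
        }, function(tx, err) {
            deferred.reject('Error fetching projects');
        });
    });
    return deferred.promise();
}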

You can download the code for fetching and adding projects from GitHub and paste it in the respective placeholders. Do not forget to add SQLiteStorageService.js to your index.html file above Controller.js, and initialize it in your controller by modifying the Controller’s init function:

initialize: function() {
    self = this;
    new SQLiteStorageService().done(function(service) {
        self.storageService = service;
        self.bindEvents();
        self.renderSearchView();
    }).fail(function(error) {
        alert(error);
    });
}

If you take a look at service.addProject(), you will notice that it makes a call to the navigator.geolocation.getCurrentPosition() method. Cordova has a geolocation plugin which you can use to get the phone’s current location, and you can even use the navigator.geolocation.watchPosition() method to receive updates when the user’s position changes.
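
For reference, a minimal sketch of such a call might look like this (the option values are just examples):

navigator.geolocation.getCurrentPosition(function(position) {
    // position.coords carries the fix returned by the device
    console.log('Lat: ' + position.coords.latitude +
                ', Lng: ' + position.coords.longitude);
}, function(error) {
    console.log('Geolocation error: ' + error.message);
}, { enableHighAccuracy: true, timeout: 10000 });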

Finally, let’s add the controller event handlers for adding and fetching projects from the database:

renderPostView: function() {
    $('.tab-button').removeClass('active');
    $('#post-tab-button').addClass('active');

    var $tab = $('#tab-content');
    $tab.empty();
    $("#tab-content").load("./views/post-project-view.html", function(data) {
        $('#tab-content').find('#post-project-form').on('submit', self.postProject);
    });
},


postProject: function(e) {

    e.preventDefault();
    var name = $('#project-name').val();
    var description = $('#project-description').val();
    var company = $('#company').val();
    var addLocation = $('#include-location').is(':checked');

    if (!name || !description || !company) {
        alert('Please fill in all fields');
        return;
    } else {
        var result = self.storageService.addProject(
            name, company, description, addLocation);

        result.done(function() {
            alert('Project successfully added');
            self.renderSearchView();
        }).fail(function(error) {
            alert(error);
        });
    }
},


renderSearchView: function() {
    $('.tab-button').removeClass('active');
    $('#search-tab-button').addClass('active');

    var $tab = $('#tab-content');
    $tab.empty();

    var $projectTemplate = null;
    $("#tab-content").load("./views/search-project-view.html", function(data) {
        $('#addressSearch').on('click', function() {
            alert('Not implemented');
        });

        $projectTemplate = $('.project').remove();

        self.storageService.getProjects().done(function(projects) {

            for(var idx in projects) {
                var $div = $projectTemplate.clone();
                var project = projects[idx];

                $div.find('.project-name').text(project.name);
                $div.find('.project-company').text(project.company);
                $div.find('.project-description').text(project.description);

                if (project.location) {
                    var url =
                        '<a target="_blank" href="https://www.google.com.au/maps/preview/@' +
                        project.location.latitude + ',' + project.location.longitude + ',10z">Click to open map</a>';

                    $div.find('.project-location').html(url);
                } else {
                    $div.find('.project-location').text("Not specified");
                }

                $tab.append($div);
            }
        }).fail(function(error) {
            alert(error);
        });
    });
}

To add the console and the dialog plugins, execute the following:

cordova plugin add org.apache.cordova.dialogs
cordova plugin add org.apache.cordova.console

The cordova.console plugin will assist you in debugging by enabling the console.log() function within emulators.

You can easily debug Android applications through the Chrome remote debugger. Once you have connected your device, click the drop-down menu in the top right corner (below the X button), expand More Tools, and click Inspect Devices. You should see your device in the list and should be able to open its debug console.

Safari provides the same functionality for debugging iOS apps running on a USB-connected device or in the emulator. Just enable Developer Tools under the Safari Settings > Advanced tab.

The cordova.dialogs plugin enables native notifications. A common practice is to redefine the window.alert method using the cordova.dialogs API in the following manner:

overrideBrowserAlert: function() {
    if (navigator.notification) { // Override default HTML alert with native dialog
        window.alert = function (message) {
            navigator.notification.alert(
                message,    // message
                null,       // callback
                "Toptal", // title
                'OK'        // buttonName
            );
        };
    }
}

The overrideBrowserAlert function should be called within the deviceready event handler.
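
A minimal sketch of that wiring, assuming the usual Cordova bootstrap in which the controller listens for deviceready, might look like this:

document.addEventListener('deviceready', function() {
    // Native APIs (navigator.notification, SQLite, geolocation) are available at this point
    self.overrideBrowserAlert();
}, false);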

You should now be able to add new projects and view existing ones from the database. If you select the checkbox “Include location”, the device will make a call to the Geolocation API and add your current location to the project.

Let us add a finishing touch to the application by setting an icon and a splash screen. Add the following to your config.xml file:

<platform name="android">
    <icon src="www/img/logo.png" />
    <splash src="www/img/logo.png" density="mdpi"/>
    <splash src="www/img/logo.png" density="hdpi"/>
    <splash src="www/img/logo.png" density="xhdpi"/>
</platform>

Finally, place a logo image in the www/img folder.

About the author:

Filip is a talented developer with excellent social and communication skills. He focuses on meeting his clients’ demands at every possible level while delivering adaptable solutions and extensively tested code. The client’s priorities are his priorities.

This post was written by Amin Shah Gilani, JavaScript developer for Toptal.

I love building things—what developer doesn’t? I love thinking up solutions to interesting problems, writing implementations, and creating beautiful code. However, what I don’t like is operations. Operations is everything not involved in building great software—everything from setting up servers to getting your code shipped to production.

This is interesting, because as a freelance Ruby on Rails developer, I frequently have to create new web applications and repeat the process of figuring out the DevOps side of things. Fortunately, after creating dozens of applications, I’ve finally settled on a perfect initial deployment pipeline. Unfortunately, not everyone’s got it figured out like I have—eventually, this knowledge led me to take the plunge and document my process.

In this article, I’ll walk you through my perfect pipeline to use at the beginning of your project. With my pipeline, every push is tested, the master branch is deployed to staging with a fresh database dump from production, and versioned tags are deployed to production with back-ups and migrations happening automatically.

Note that since it’s my pipeline, it’s also opinionated and suited to my needs; however, feel free to swap out anything you don’t like and replace it with whatever strikes your fancy. For my pipeline, we’ll use:

  • GitLab to host code.
    • Why: My clients prefer their code to remain secret, and GitLab’s free tier is wonderful. Also, integrated free CI is awesome. Thanks GitLab!
    • Alternatives: GitHub, BitBucket, AWS CodeCommit, and many more.
  • GitLab CI to build, test, and deploy our code.
    • Why: It integrates with GitLab and is free!
    • Alternatives: TravisCI, Codeship, CircleCI, DIY with Fabric8, and many more.
  • Heroku to host our app.
    • Why: It works out of the box and is the perfect platform to start off on. You can change this in the future, but not every new app needs to run on a purpose-built Kubernetes cluster. Even Coinbase started off on Heroku.
    • Alternatives: AWS, DigitalOcean, Vultr, DIY with Kubernetes, and many more.

Old-school: Create a Basic App and Deploy It to Heroku

First, let’s recreate a typical application for someone who isn’t using any fancy CI/CD pipelines and just wants to deploy their application.

Diagram of traditional code hosting and deploying actions

It doesn’t matter what kind of app you’re creating, but you will require Yarn or npm. For my example, I’m creating a Ruby on Rails application because it comes with migrations and a CLI, and I already have the configuration written for it. You’re welcome to use any framework or language you prefer, but you’ll need Yarn to do the versioning I do later on. I’m creating a simple CRUD app using only a few commands and no authentication.

And let’s test if our app is running as expected. I went ahead and created a few posts, just to make sure.

The application running in development

And let’s deploy it to Heroku by pushing our code and running migrations:

$ heroku create toptal-pipeline
Creating ⬢ toptal-pipeline... done
https://toptal-pipeline.herokuapp.com/ | https://git.heroku.com/toptal-pipeline.git
$ git push heroku master
Counting objects: 132, done.
...
To https://git.heroku.com/toptal-pipeline.git
 * [new branch]      master -> master
$ heroku run rails db:migrate
Running rails db:migrate on ⬢ toptal-pipeline... up, run.9653 (Free)
...

Finally, let’s test it out in production:

The application running in production

And that’s it! Typically, this is where most developers leave their operations. In the future, if you make changes, you would have to repeat the deploy and migration steps above. You may even run tests if you’re not running late for dinner. This is great as a starting point, but let’s think about this method a bit more.

Pros

  • Quick to set up.
  • Deployments are easy.

Cons

  • Not DRY: Requires repeating the same steps on every change.
  • Not versioned: “I’m rolling back yesterday’s deployment to last week’s” isn’t very specific three weeks from now.
  • Not bad-code-proof: You know you’re supposed to run tests, but no one’s looking, so you might push it despite the occasional broken test.
  • Not bad-actor-proof: What if a disgruntled developer decides to break your app by pushing code with a message about how you don’t order enough pizzas for your team?
  • Does not scale: Allowing every developer the ability to deploy would give them production level access to the app, violating the Principle of Least Privilege.
  • No staging environment: Errors specific to the production environment won’t show up until production.

The Perfect Initial Deployment Pipeline

I’m going to try something different today: Let’s have a hypothetical conversation. I’m going to give “you” a voice, and we’ll talk about how we can improve this current flow. Go ahead, say something.

Say what? Wait—I can talk?

Yes, that’s what I meant about giving you a voice. How are you?

I’m good. This feels weird

I understand, but just roll with it. Now, let’s talk about our pipeline. What’s the most annoying part about running deployments?

Oh, that’s easy. The amount of time I waste. Have you ever tried pushing to Heroku?

Yeah, watching your dependencies download and your application build as part of the git push is horrible!

I know, right? It’s insane. I wish I didn’t have to do that. There’s also the fact that I have to run migrations *after* deployment, so I have to watch the whole show and check to make sure my deployment runs through.

Okay, you could actually solve that latter problem by chaining the two commands with &&, like git push heroku master && heroku run rails db:migrate, or by creating a bash script and putting it in your code. But still, great answer; the time and repetition are a real pain.

Yeah, it really sucks

What if I told you you could fix that bit immediately with a CI/CD pipeline?

A what now? What is that?

CI/CD stands for continuous integration (CI) and continuous delivery/deployment (CD). It was fairly tough for me to understand exactly what it was when I was starting out because everyone used vague terms like “amalgamation of development and operations,” but put simply:

  • Continuous Integration: Making sure all your code is merged together in one place. Get your team to use Git and you’ll be using CI.
  • Continuous Delivery: Making sure your code is continuously ready to be shipped, meaning you can produce a ready-to-distribute version of your product quickly.
  • Continuous Deployment: Seamlessly taking the product from continuous delivery and just deploying it to your servers.

Oh, I get it now. It’s about making my app magically deploy to the world!

My favorite article explaining CI/CD is by Atlassian here. This should clear up any questions you have. Anyways, back to the problem.

Yeah, back to that. How do I avoid manual deploys?

Setting Up a CI/CD Pipeline to Deploy on Push to master

What if I told you you could fix that bit immediately with a CI/CD pipeline? You can push to your GitLab remote (origin) and a computer will be spawned to straight-up simply push that code of yours to Heroku.

No way!

Yeah way! Let’s jump back into code again.

Diagram of a simple deploy CI/CD pipeline

Create a .gitlab-ci.yml with the following contents, swapping out toptal-pipeline for your Heroku app’s name:

image: ruby:2.4

before_script:
  - >
   : "${HEROKU_EMAIL:?Please set HEROKU_EMAIL in your CI/CD config vars}"
  - >
   : "${HEROKU_AUTH_TOKEN:?Please set HEROKU_AUTH_TOKEN in your CI/CD config vars}"
  - curl https://cli-assets.heroku.com/install-standalone.sh | sh
  - |
    cat >~/.netrc <<EOF
    machine api.heroku.com
      login $HEROKU_EMAIL
      password $HEROKU_AUTH_TOKEN
    machine git.heroku.com
      login $HEROKU_EMAIL
      password $HEROKU_AUTH_TOKEN
    EOF
  - chmod 600 ~/.netrc
  - git config --global user.email "ci@example.com"
  - git config --global user.name "CI/CD"

variables:
  APPNAME_PRODUCTION: toptal-pipeline

deploy_to_production:
  stage: deploy
  environment:
    name: production
    url: https://$APPNAME_PRODUCTION.herokuapp.com/
  script:
    - git remote add heroku https://git.heroku.com/$APPNAME_PRODUCTION.git
    - git push heroku master
    - heroku pg:backups:capture --app $APPNAME_PRODUCTION
    - heroku run rails db:migrate --app $APPNAME_PRODUCTION
  only:
    - master

Push this up, and watch it fail in your project’s Pipelines page. That’s because it’s missing the authentication keys for your Heroku account. Fixing that is fairly straightforward, though. First you’ll need your Heroku API key. Get it from the Manage Account page, and then add the following secret variables in your GitLab repo’s CI/CD settings:

  • HEROKU_EMAIL: The email address you use to sign into Heroku
  • HEROKU_AUTH_TOKEN: The API key you got from Heroku

Image of the secret variables in the GitLab CI/CD settings page

This should result in a working GitLab-to-Heroku deployment on every push. As to what’s happening:

  • Upon pushing to master
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your database is captured in Heroku.
    • Migrations are run.

Already, you can see that not only are you saving time by automating everything to a git push, you’re also creating a backup of your database on every deploy! If anything ever goes wrong, you’ll have a copy of your database to revert back to.

Creating a Staging Environment

But wait, quick question, what happens to your production-specific problems? What if you run into a weird bug because your development environment is too different from production? I once ran into some odd SQLite 3 and PostgreSQL issues when I ran a migration. The specifics elude me, but it’s quite possible you’ll hit something similar.

I strictly use PostgreSQL in development, I never mismatch database engines like that, and I diligently monitor my stack for potential incompatibilities.

Well, that’s tedious work and I applaud your discipline. Personally, I’m much too lazy to do that. However, can you guarantee that level of diligence for all potential future developers, collaborators, or contributors?

Errrr— Yeah, no. You got me there. Other people will mess it up. What’s your point, though?

My point is, you need a staging environment. It’s like production but isn’t. A staging environment is where you rehearse deploying to production and catch all your errors early. My staging environments usually mirror production, and I dump a copy of the production database on staging deploy to ensure no pesky corner cases mess up my migrations. With a staging environment, you can stop treating your users like guinea pigs.

This makes sense! So how do I do this?

Here’s where it gets interesting. I like to deploy master directly to staging.

Wait, isn’t that where we’re deploying production right now?

Yes it is, but now we’ll be deploying to staging instead.

But if master deploys to staging, how do we deploy to production?

By using something you should’ve been doing years ago: Versioning our code and pushing Git tags.

Git tags? Who uses Git tags?! This is beginning to sound like a lot of work.

It sure was, but thankfully, I’ve done all that work already and you can just dump my code and it’ll work.

Overview of how staging and production deploys will work

First, add a block about the staging deploy to your .gitlab-ci.yml file. I’ve created a new Heroku app called toptal-pipeline-staging:



variables:
  APPNAME_PRODUCTION: toptal-pipeline
  APPNAME_STAGING: toptal-pipeline-staging


deploy_to_staging:
  stage: deploy
  environment:
    name: staging
    url: https://$APPNAME_STAGING.herokuapp.com/
  script:
    - git remote add heroku https://git.heroku.com/$APPNAME_STAGING.git
    - git push heroku master
    - heroku pg:backups:capture --app $APPNAME_PRODUCTION
    - heroku pg:backups:restore `heroku pg:backups:url --app $APPNAME_PRODUCTION` --app $APPNAME_STAGING --confirm $APPNAME_STAGING
    - heroku run rails db:migrate --app $APPNAME_STAGING
  only:
    - master
    - tags

...

Then change the last line of your production block to run on semantically versioned Git tags instead of the master branch:

deploy_to_production:
...
  only:
    - /^v(?'MAJOR'(?:0|(?:[1-9]\d*)))\.(?'MINOR'(?:0|(?:[1-9]\d*)))\.(?'PATCH'(?:0|(?:[1-9]\d*)))(?:-(?'prerelease'[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*))?(?:\+(?'build'[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*))?$/
    # semver pattern above is adapted from https://github.com/semver/semver.org/issues/59#issuecomment-57884619

Running this right now will fail because GitLab is smart enough to only allow “protected” branches access to our secret variables. To add version tags, go to your GitLab project’s repository settings page and add v* to protected tags.

Image of the version tag being added to protected tags in the repository settings page

Let’s recap what’s happening now:

  • Upon pushing to master, or pushing a tagged commit
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your production database is captured in Heroku.
    • The backup is dumped in your staging environment.
    • Migrations are run on the staging database.
  • Upon pushing a semantically versioned tag
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your production database is captured in Heroku.
    • Migrations are run on the production database.

Do you feel powerful now? I feel powerful. I remember, the first time I came this far, I called my wife and explained this entire pipeline in excruciating detail. And she’s not even technical. I was super impressed with myself, and you should be too! Great job coming this far!

Testing Every Push

But there’s more: since a computer’s doing stuff for you anyway, it could also run all the things you’re too lazy to do: tests, linting, pretty much anything you want, and if any of these fail, the pipeline won’t move on to deployment.

I love having this in my pipeline; it makes my code reviews fun. If a merge request gets through all my code checks, it deserves to be reviewed.

Image of Testing on every push

Add a test block:

test:
  stage: test
  variables:
    POSTGRES_USER: test
    POSTGRES_PASSWORD: test-password
    POSTGRES_DB: test
    DATABASE_URL: postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres/${POSTGRES_DB}
    RAILS_ENV: test
  services:
    - postgres:alpine
  before_script:
    - curl -sL https://deb.nodesource.com/setup_8.x | bash
    - apt-get update -qq && apt-get install -yqq nodejs libpq-dev
    - curl -o- -L https://yarnpkg.com/install.sh | bash
    - source ~/.bashrc
    - yarn
    - gem install bundler  --no-ri --no-rdoc
    - bundle install -j $(nproc) --path vendor
    - bundle exec rake db:setup RAILS_ENV=test
  script:
    - bundle exec rake spec
    - bundle exec rubocop

Let’s recap what’s happening now:

  • Upon every push, or merge request
    • Ruby and Node are set up in a container.
    • Dependencies are installed.
    • The app is tested.
  • Upon pushing to master, or pushing a tagged commit, and only if all tests pass
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your production database is captured in Heroku.
    • The backup is dumped in your staging environment.
    • Migrations are run on the staging database.
  • Upon pushing a semantically versioned tag, and only if all tests pass
    • The Heroku CLI is installed and authenticated in a container.
    • Your code is pushed to Heroku.
    • A backup of your production database is captured in Heroku.
    • Migrations are run on the production database.

Take a step back and marvel at the level of automation you’ve accomplished. From now on, all you have to do is write code and push. Test out your app manually in staging if you feel like it, and when you feel confident enough to push it out to the world, tag it with semantic versioning!

Automatic Semantic Versioning

Yeah, it’s perfect, but there’s something missing. I don’t like looking up the last version of the app and explicitly tagging it. That takes multiple commands and distracts me for a few seconds.

Okay, dude, stop! That’s enough. You’re just over-engineering it now. It works, it’s brilliant, don’t ruin a good thing by going over the top.

Okay, I have a good reason for doing what I’m about to do.

Pray, enlighten me.

I used to be like you. I was happy with this setup, but then I messed up. git tag lists tags in alphabetical order, so v0.0.11 is listed above v0.0.2. I once accidentally mis-tagged a release and kept doing it for about half a dozen releases until I saw my mistake. That’s when I decided to automate this too.

Here we go again

Okay, so, thankfully, we have the power of npm at our disposal, and I found a suitable package. Run yarn add --dev standard-version and add the following to your package.json file:

  "scripts": {
    "release": "standard-version",
    "major": "yarn release --release-as major",
    "minor": "yarn release --release-as minor",
    "patch": "yarn release --release-as patch"
  },

Now you need to do one last thing: configure Git to push tags by default. At the moment, you need to run git push --tags to push a tag up, but you can make that happen automatically on a regular git push by running git config --global push.followTags true.

To use your new pipeline, whenever you want to create a release, run:

  • yarn patch for patch releases
  • yarn minor for minor releases
  • yarn major for major releases

If you’re unsure about what the words “major,” “minor,” and “patch” mean, read more about this at the semantic versioning site.

Now that you’ve finally completed your pipeline, let’s recap how to use it!

  • Write code.
  • Commit and push it to test and deploy it to staging.
  • Use yarn patch to tag a patch release.
  • git push to push it out to production.

Summary and Further Steps

I’ve only just scratched the surface of what’s possible with CI/CD pipelines. This is a fairly simplistic example. You can do so much more by swapping out Heroku with Kubernetes. If you decide to use GitLab CI, read the YAML docs, because there’s so much more you can do by caching files between deploys or saving artifacts!

Another huge change you could make to this pipeline is to introduce external triggers to run the semantic versioning and releasing. Currently, ChatOps is part of their paid plan, and I hope they release it to free plans. But imagine being able to trigger the pipeline in the next image through a single Slack command!

Diagram of a CI/CD deployment pipeline where production deploys are triggered externally, possibly via chat or webhooks

Eventually, as your application starts to grow complex and requires system-level dependencies, you may need to use a container. When that happens, check out our guide: Getting Started with Docker: Simplifying Devops.

This example app really is live, and you can find the source code for it here.

This post was written by Jonathan Bethune, Python developer for Toptal.

Those of us who are old enough can remember a day when software was delivered primarily by physical media. The spread of broadband internet and smartphones has led us to the age of the web service: software hosted in the cloud and accessed by user clients such as browsers and apps.

Not too long ago, web applications were run directly on physical machines in private data centers. For ease of management, these applications were usually monolithic—a single large server would contain all of the back-end code and database. Now, web hosting services like Amazon and the spread of hypervisor technology have changed all of that. Thanks to Amazon Web Services (AWS) and tools like VirtualBox, it has become easy to package an entire OS in a single file.

Using services like EC2, it has become easy to package machine images and string together sets of virtual servers. Along came the microservices paradigm—an approach to software architecture wherein large monolithic apps are broken up into smaller focused services that do one thing well. In general, this approach allows for easier scaling and feature development as bottlenecks are quicker to find and system changes easier to isolate.

Pets to Livestock

I became an infrastructure engineer right at the height of this trend. I recall building my first production environment in Amazon using a series of bash scripts. The servers were like pets to me. I gave each of them cute names. I monitored them carefully. I responded to alerts quickly and kept them healthy. I treated those instances with love and affection because it was painful to try to replace them—much like a beloved pet.

Along came Chef, a configuration management tool, and almost immediately my life got easier. With tools like Chef and Puppet, you can take away most of the manual pain associated with managing a cloud system. You can use its “environments” construct to separate development, staging, and production servers. You can use its “data bags” and “roles” to define configuration parameters and push sets of changes. Now, all of my “pet” servers had graduated from obedience school.

A graphic representation of a crane managing shipping containers

Then in 2013, along came Docker, and a new era began: the age of software as livestock (apologies to any vegans in the audience). The container paradigm is one of orchestration, not configuration management. Tools like Kubernetes, Docker Compose, and Marathon focus on moving around predefined images instead of adjusting config values on running instances. Infrastructure is immutable; when a container goes bad, we don’t try to fix it—we shoot it in the head and replace it. We care more about the health of the herd than individual animals. We don’t give our servers cute names anymore.

The Rewards

Containers make a lot of things easier. They let businesses focus more on their own special sauce. Tech teams can worry less about infrastructure and configuration management and instead worry mostly about app code. Companies can go a step further and use managed services for things like MySQL, Cassandra, Kafka, or Redis so as not to have to deal with the data layer at all. There are several startups offering “plug and play” machine learning services as well to allow companies to do sophisticated analytics without worrying about the infrastructure. These trends have culminated in the serverless model—a software architecture approach that allows teams to release software without managing a single VM or container. AWS services like S3, Lambda, Kinesis, and Dynamo make this possible. So to extend the analogy, we have gone from pets to livestock to some sort of on-demand animal service.

All of this is very cool. It is crazy that we live in a time where a twelve-year-old kid can spin up a sophisticated software system with a few clicks. We should remember that, not very long ago, this was impossible. Just a few US presidents ago, physical media was the standard and only big companies had the means to manufacture and distribute software. Bug fixes were a luxury. Now, that twelve-year-old kid can create an AWS account and make his software available to the entire world. If there’s a bug, someone will bug him on Slack and, in a few minutes, a fix is out for all users.

The Risks

Very, very cool, but not without its price—reliance on cloud providers like Amazon means reliance on big corporations and proprietary technologies. If Richard Stallman and Edward Snowden haven’t made you worry about such things, the recent debacle with Facebook certainly should have.

Greater abstraction away from hardware also brings with it the risk of less transparency and control. When something breaks in a system running hundreds of containers, we have to hope that the failure bubbles up somewhere we can detect. If the problem is with the host operating system or underlying hardware, it might be hard to determine. An outage that could have been resolved in 20 minutes using VMs may take hours or days to resolve with containers if you do not have the right instrumentation.

It isn’t just failures either that we need to worry about when it comes to things like Docker. There is also the problem of security. Whatever container platform we use, we have to trust that there are no backdoors or undisclosed security vulnerabilities. Using open-source platforms is no guarantee of safety either. If we rely on third-party container images for parts of our system, we may be vulnerable.

Wrap Up

The livestock paradigm is attractive for a number of reasons, but it is not without its downsides. Before rushing to containerize the entire stack, tech teams need to think about whether or not it is the right choice and ensure they can mitigate the negative effects.

Personally, I love working with containers. I’m excited to see where things go in the next ten years as new platforms and paradigms arise. However, as a former security consultant, I am cautious enough to know that everything comes with a price. It is up to engineers to remain vigilant to ensure that we don’t give up our autonomy as users and developers. Even the easiest CI/CD workflow in the world would not be worth that cost.

This post was written by Phillip Brennan, Software Developer for Toptal.

Regardless of the apparent evidence to the contrary, programmers are humans. And, like all humans, we like taking advantage of our freedom of choice. Whether that choice is about taking the red pill or the blue pill, wearing a dress or pants, or using one development environment over another, the choice we make places us in one group of people or another. Choice, inevitably, comes after our evaluation of options. And having made a choice, we tend to believe that anyone who chooses differently made a mistake.

You can easily search the internet and find hundreds of debates about Emacs vs Vim. Even if you read them all, it will be impossible to objectively choose a winner. However, does the choice of development environment tell you anything about the quality of work a developer can deliver? Absolutely not!

A great developer could write her code in Notepad and still deliver great stuff.

Certainly, there are a lot of things professionals consider when selecting tools for their work. This is true for every profession, including software development. Quite often, however, selection is based on personal taste, not something easily tangible.

Programmers spend most of their time looking at the development environment, so it is natural that we want something pretty as well as functional. Every development environment has its pros and cons. As a whole, they are a driving force of the software development industry.

best programming editors

What are the things a developer should evaluate when choosing a set of programming tools, such as a programming editor of choice? The answer to this question is not as simple as it might sound. Software development is close to an art, and there are quite a few “fuzzy” factors that separate a masterpiece from an overpriced collectable.

Every programming language, be it Java, C#, PHP, Python, Ruby, JavaScript, and so on, has its own development practices related to project structure, debugging, and deploying. However, one thing they all have in common is editing code. In this article we will evaluate different development platforms from the perspective of the most common task in software development: writing code.

IDE vs General Purpose Text Editor

An integrated development environment (IDE) (or interactive development environment) is a software application that provides comprehensive facilities to computer programmers for software development. An IDE normally consists of a source code editor, build automation tools, and a debugger, and many support lots of additional plugins and extensions.

Text editors are simpler applications. Compared to IDEs, they usually correspond to just the code editor segment of an IDE. However, they are often much more than that. IDEs are created to serve the purpose of software development, while many text editors are designed to be used by non-developers as well.

Statically typed languages can get a lot of benefit from IDEs. Because of the strict typing rules, it is possible for the IDE to detect bugs and naming inconsistencies across classes and modules, and even across files, directly in the editor, before compiling. This functionality comes standard with many IDEs, and for that reason, IDEs are very popular for statically typed languages.

However, it is impossible to do the same thing for dynamically typed languages. For example, if a method name is generated by the code itself, constructed from a series of string concatenations, trying to detect naming errors requires nothing less than running the actual program. Because one of the major benefits of IDEs does not apply to dynamic language programmers, they have a greater tendency to stick with text editors like Sublime. As a side note, this is also a major reason why the test-driven development movement has grown up around dynamic language communities, and has not had as strong a following among users of statically typed languages.
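
To illustrate, here is a contrived JavaScript sketch of a call that no editor can verify without running the code:

var user = {
    getName: function() { return 'Ada'; }
};

var field = 'Name';                 // could just as easily come from user input or data
var methodName = 'get' + field;     // method name assembled at runtime
console.log(user[methodName]());    // runs fine, but only execution proves the method exists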

What Makes a Great Programming Editor?

Aside from a number of different features for various languages, every programming editor needs to have a well-organized and clean user interface. Overall aesthetic appeal should not be overlooked, either. It is not just a matter of looking good, as a well-designed editor with the right choice of font and colors helps keep eyestrain down and lets you be more productive.

In today’s development environment, a steep learning curve is a liability, regardless of feature set. Time is always valuable, so a good editor should be easy to get used to. Ideally, the programmer should be able to start work immediately, without having to jump through too many hoops. A Swiss army knife is a practical and useful tool, yet anyone can master it in minutes. Likewise, for programming editors, simplicity is a virtue.

User Interface, Features, and Workflow

Let’s take a closer look at UI, different features and capabilities, and frequently used tools that should be a part of any programming editor.

Line numbers, of course, should be on by default and simple to turn on or off.

Snippets are useful for inserting standardized blocks of text in a fixed layout. However, programming is a lot about saying things just once, so be careful with snippets as they might make your code hard to maintain in the future.

The ability to lint, or syntax-check, the current file is useful, as is the ability to launch it. Without this facility, a programmer must switch to an external command line window, choose and run the correct command, and then step through error messages to find the source of the error. However, the linting must be under the programmer’s control, because the delay incurred by the lint might interrupt the coder at a crucial moment.

inline doc

Inline doc is useful as long as it does not get in the way, but having a browser page open on the class definitions is sometimes more useful, especially when there are lots of related classes that do not directly extend each other. It is easy enough to cut and paste code from the browser documentation to the code being written, so the additional complexity of inline documentation often becomes less useful, indeed, more annoying, as the programmer’s knowledge of the documentation increases.

Word completion is helpful since it is fast, and almost as reliable as in-edit documentation, while being less intrusive. It is satisfying to enter just a few characters of a word and then hit enter to get the rest. Otherwise, one labors under the strain of excess typing, abhorred by lazy programmers, who want to type ee rather than the more lengthy exponentialFunctionSquared. Word completion satisfies by minimizing typing, enforcing coherent naming, and not getting in the way.

Renaming variables and functions across the program is useful, but you need to be able to review changes and make sure your code is not broken. Again, word completion is a useful halfway house, in that it works for all languages; you can use long names for items that have long lifetimes, without incurring a typing overhead. You can use references to them via a shorter name locally, in order to shorten expressions which might otherwise spread over too many lines. If you need to rename, the long names are unique, so this approach works across all languages, and all files.

Source files can sometimes grow a lot. Code-folding is a nice feature that simplifies reading through long files.

Find/change with scope limitation to local, incremental, or global with meta characters and regular expressions are part of the minimum requirement these days, as is syntax highlighting.

Over the years, I went through a number of editors, and this is what I think of them:

  • Emacs: One of the most popular editors in the world. Emacs’ greatest feature is its extensibility, despite the complexity of its extension language (you can even play Tetris in it with M-x tetris). Emacs fans consider its terminal-based interface to be a great feature, while others might debate that it’s a drawback. In my personal experience, I found it too much to adopt and learn. I am sure that if you know how to use Emacs you will never use anything else, but to take on and learn the entire culture was more than I wanted to do. Nevertheless, its popularity among developers proves that it is far from being a relic of the old times, and remains part of our future as well.
  • Vi/Vim: Vim is another powerful terminal-based editor, and it comes standard with most xNIX operating systems. Apart from having a different interface than Emacs, my view is practically the same. If you grew up on it, I am sure you will never use anything else. Having Vi skills will make your life much simpler when operating through SSH and other tight spots, and you won’t have problems with speed once you get familiar with the keystrokes. While not as tough to crack into as Emacs, the learning curve is still quite steep, and it could definitely use a few nice features of a windowed editor.
  • SublimeText: True to its name, SublimeText is a beautiful text editor with tons of features. Unlike some similar editors, SublimeText is closed source, so it cannot be modified at a low level. SublimeText offers the simplicity of traditional text editors, with a lean and fast UI. Many developers find it easier to use than Vim, and this is especially true of newcomers. The learning curve just isn’t as steep. While the UI is minimal and straightforward, SublimeText does offer a few nifty features, such as a scaled-down display of the code on the right of the UI, allowing users to quickly scroll through their code and navigate with relative ease. While it’s not completely free, the feature-limited demo version is. Unlocking all the features will cost you $70.
  • Atom: The result of a GitHub effort to produce a programming editor for a new generation of developers. While it is still a work in progress, Atom is a very capable editor with a vibrant community of developers keen on new extensions, JavaScript libraries, and more. Its downsides include some UI quirks, the possibility that some add-on packages could misbehave, and reported performance issues when working with (very) big files. But the project is under active development, and current shortcomings are likely to be improved. Atom is an open-source project, and it can easily be hacked to suit your needs.
  • Nano: Excellent in a tight corner, but not feature-rich enough to prevent the inevitable thought creeping into one’s mind that there must be a faster way to do this, as one struggles through the keystrokes to indent a block of code while keeping the comments lined up in column 80! It does not even have text highlighting, and should not be used for anything more than config file changes.
  • TextMate2: TextMate’s biggest drawback is that it only runs on Mac. As its creators put it, “TextMate brings Apple’s approach to operating systems into the world of text editors.” By bridging UNIX underpinnings and GUI, TextMate cherry-picks the best of both worlds, to the benefit of expert scripters and novice users alike. It is the editor of choice for many Ruby, Python, and JavaScript developers, with great support for Bash or Markdown as well. At the moment of publishing this article, TextMate 2 is still in beta, but it already has a very mature plugin ecosystem that promises to extend it even beyond Emacs’s extensions.
  • jEdit: Java-based, and considered slow by some. Out of the box configuration might push certain people away, but jEdit can be extremely fast if configured properly, as well as extremely nice looking.
  • Eclipse: Another widely used IDE, Eclipse is very popular among Java developers, but has been adapted to many different platforms. We could argue that its monolithic architecture is a rock that will pull it under the water, but it is still one of the most popular platforms among developers.
  • Aptana Studio: A comprehensive open-source web application IDE. It is available as an Eclipse plugin, which makes it popular among some Java developers. The standalone version is even leaner, and offers a range of different themes and customization options. Aptana’s project management capabilities may also come in handy to coders who honed their skills in Eclipse. While earlier versions suffered from performance issues on some hardware platforms, these problems were addressed in Aptana Studio 3, and should be a thing of the past.
  • NetBeans: Another relatively popular open-source IDE with cross-platform support. It is somewhat slower on startup than lean editors like SublimeText, and the choice of add-ons is limited compared to some alternatives. Many Java developers have grown to love NetBeans thanks to seamless SCM integration and HTML5 support. NetBeans support for PHP has also improved in the latest releases.
  • JetBrains: Offers a family of IDEs for Java, Ruby, Python, and PHP, all based on the same core engine. Very capable in their own right, JetBrains IDEs have been gaining a growing following. However, they are not free, open-source solutions, although a 30-day trial is available and pricing is reasonable.
  • Komodo Edit: Komodo Edit has great potential, and yet it’s full of annoying little “gotchas” and idiosyncrasies that can be frustrating, not helped by its lack of orthogonality. Komodo Edit feels cluttered, which is a shame because it clearly has immense capability. I keep going back to Komodo Edit in the hopes that I have missed some organizing principle, and every time, I am beaten back by a welter of disorganized capability.
  • Geany: Geany is not a major power player like many of the other editors in this list. It is defined more by “what it is not” than “what it is.” It is not slow, it does not have a lot of heritage from the old days, it does not have a macro capability, or much of a multi-window-on-buffer capability. Yet the things it does do, it does well enough. It is, perhaps, the least demanding of all the editors I tried, and it can still do 90 percent of what you would expect from a programmer’s editor. Geany looks good enough on Ubuntu, which is one of the reasons I chose it as my preferred editor.

My Conclusion

It would be presumptuous to declare just one as the best programming editor among these great tools. And there are quite a few editors I did not even try. There is no one-size-fits-all solution. This is what compelled me to try out a number of different editors.

I am currently using Geany because it fits my requirements. With Geany, and a lot of help from Perl/Gimp/Audacity/Sox, I am able to develop and maintain the Java code base for the Android apps I develop, prepare them for compilation in different configurations for multiple distributors, source, lint, compile, dex, and produce .apk files, and deliver these apps globally.

Your line of development might set a different set of requirements, and I hope I have saved you some time in researching the most appropriate programming editor.