Gap Buffers

I recently mentioned a personal project I’m working on to create a text editor, which I’m calling nabu. Tonight, I thought I’d take some time off from writing code to talk a bit about the data structure I’m using for the nabu character buffer: the gap buffer.

At its core, a character buffer is just a mutable list of characters. A list is an abstract data structure that holds an indexed set of elements. A mutable list can be modified, so you can replace any element with a new element, as well as add and remove elements at any index. A key feature of a list is random access, meaning you can access any element from the set by using its index.

There are lots of ways to implement a list data structure. The most straight forward is an array list, where a sequence of consecutive memory locations are used to store the elements, one element per location. An array list is really simple, and really memory efficient. It’s really fast to iterate over, great for random access, and modifying existing elements is extremely quick. However, inserting and deleting elements is generally quite slow. In the worst case for an insert, you need to reallocate the entire buffer (in order to get enough space for the new element), which involves copying every existing element to the new location, and then copy every element again to move them out of the way for the new element. Deletion is similar, though doesn’t usually require reallocation.

Another implementation option is a linked list, where the elements are stored in a set of nodes. Each node stores a single element as well as a pointer (reference) to the next node in the list. Linked lists are pretty efficient at element modification, insertion, and deletion. They’re relatively quick at iterating, but terrible at random access (in its simplest implementation, a linked list requires iterating all the way from the beginning to the element in question). They can also require a significant amount of memory overhead; in the case of a character buffer, a linked list can easily double the memory requirements, or worse (e.g., adding a 32-bit pointer for each 32-bit Unicode character).

Insertion and deletion are critically important for a text editor, so an array list would be a costly choice of implementation. A linked list would be better in this regard, but the memory overhead would be significant, as would the slowness of random access (imagine moving the cursor from the beginning of a document to the end by iterating one character at a time).

A third way to implement a list is called a gap buffer. Structurally, it’s very much like an array list but with a simple modification that makes insertion and deletion much cheaper. For this reason, gap buffers are commonly found in text editor implementations, including my own implementation in nabu.

So what is a gap buffer? Simply put, it’s an array list with a gap in it. Imagine you have an array list but you know you’re going to be appending some elements to it. You can pre-allocate additional space at the end of the array for the future elements. Conceptually, the buffer now has a gap at the end that can be filled in with added elements.

Continue reading “Gap Buffers”

Gap Buffers

Dead Simple Observables in Python

https://github.com/mearns/pyDso

This is a really quick python module I banged out one evening while watching “Astronaut Wives”. It implements some basics of observables and reactive programming principles. It’s not meant as a serious library for you to use, more of a way to show how simple observables really are.

For a great introduction to reactive programming, check out The introduction to Reactive Programming you’ve been missing, by Andre Staltz.

If you want an observable library to actually use, you can check out Reactive-Extensions, including RxPy for python.

Status

Why wouldn’t I create my own text editor?

I love Vim, except now that I’ve used Atom, I hate Vim. Not because I like Atom, I hate Atom, too. But it’s so much prettier than Vim!

So I’m no longer satisfied with any text editor that I’ve tried. So obviously, I’ll just create my own.

Continue reading “Why wouldn’t I create my own text editor?”

Status

Code Reviews (FTW!)

Code reviews are so unbelievably important for a team, I can’t believe how many teams don’t do them. And it isn’t just the development team, its any team that writes code. Dev, QA, Database, Ops, everyone! If your customer service team is writing macros in Excel, they should be code reviewed!

Let me just start by acknowledging that even a thorough code review won’t catch every issue. Code reviews are pretty good at catching certain types of issues. Silly errors like using an assignment operator instead of an equality operator, or a strict comparator instead of non-strict. These are mistakes that are easy to make, and can often have very subtle consequences that are hard for your tests to catch. Fortunately, they also tend to be the types of errors that stand out to code reviewers (probably because they are simple and everyone’s looking to score points on the review).

Continue reading “Code Reviews (FTW!)”

Code Reviews (FTW!)