Home

A Seriously Hacky Editor

Estimated Reading Time: 5.74 minutes.

A little while ago, I made a truly horrible abomination of an editor, called selfedit.

When I say this is an abomination, I mean it.

Documents created with it, if emailed, will usually be destroyed by the email provider. Browser support is complicated, but not great.

It took me approximately fifteen minutes total to throw the whole thing together.

Now that I've hopefully drilled into your head that this is a bad idea, lets drill into it!


Base Components

<div id="c" contenteditable></div>
<a id="p"></a>
<script>...</script>
<style>
#p {
  display: block;
  margin: 0 auto;
  padding: 1.5em;
  text-align: center;
}

#c {
  width: 80%;
  padding: 1em;
  margin: 0 auto;
  box-shadow: 0px 0px 5px 0px rgba(0,0,0,0.75);
  border-radius: 0.2em;
  line-height:1.5;font-size: 1em;
}

pre {
  font-family:Consolas,Monaco,Lucida Console,Liberation Mono,DejaVu Sans Mono,Bitstream Vera Sans Mono,Courier New, monospace;
  padding-left: 1.5em;
}

.quote {
  padding-left: 0.5em;
  border-left: 6px solid #ccc;
  background-color: #f1f1f1;

}
</style>

There are exactly four parts to selfedit.

  1. Somewhere we can put our content. Our content-editable div.

  2. Somewhere we can put our payload. This link will contain both any content, and the editor itself. Thus, having the link is enough to reproduce the document. No server required.

  3. Our script. We'll go into this for the majority of this post, later.

  4. Some basic styling. Editors need decent styles. A few simple rules and we end up with something half decent.


Self-Hosted Link

It may come as a surprise to some readers that a link can, in fact, contain the entirety of its own content. This is via something called data URIs.

A data URI can take a lot of forms, but the one we're interested in is:

data:text/html;base64,$PAYLOAD

We tell the browser the link contains valid HTML, which is encoded in base64 (so that we can stick non-URL-safe characters into the link), and then $PAYLOAD contains the encoded values.

Now, because data URIs are so unsafe, most email providers strip them if you try and send them, even in plain text. Browsers can prevent you copying them to the clipboard in some cases. A data URI containing HTML should pretty much never be trusted.

(There are safer data URIs. Like image files. But, that's another discussion.)

Constructing this link is actually extremely trivial for us:

var p = document.getElementById('p');

...

p.textContent = 'Copy me';
p.href = "data:text/html;base64," + btoa(c.parentElement.innerHTML);

Voila! We now have a link that contains the page itself.

The harder part will be creating an editor with content-editable, which is rather fiddly. But, essential if we want to make this project even interesting enough to play around with.


Editor

var c = document.getElementById('c');
var p = document.getElementById('p');

window.addEventListener('load', function() {
    c.addEventListener('input', function(event) {
        p.textContent = '';
        p.href = '';

        if(event.inputType != 'insertText' && event.inputType != 'deleteContentBackward') {
            ...
        }
    }, false);
});

Turns out running input events without causing lag is actually somewhat difficult, strangely enough. We settled for basically running our editor scripts when the user hits the return key. Some other events also trigger it, but it means it runs infrequently enough that we won't kill performance.

Next, we need to manipulate the content of the div so that our mini-markup will work and the user gets more than plaintext.

var children = c.querySelectorAll('*:not(br):not(pre)');
for(var i = 0; i < children.length; i++) {
    ...
}

That query selector is magic, and extremely useful. The addition of querySelectorAll is a godsend for this kind of work. In this case, we're grabbing every element that is not br (line endings) or pre (code blocks) from the content-editable div.

Titles

if(children[i].textContent[0] == '#') {
    children[i].innerHTML = '<div><h1>' + children[i].textContent.slice(1) + '</h1></div>';
}

Titles are dead simple to implement. They start with an expected character, which we'll be stripping away.

List Items

if(children[i].textContent[0] == '+') {
    children[i].innerHTML = '<div><li>' + children[i].textContent.slice(1) + '</li></div>';
}

List items are a little bit more complex. We're creating a bunch of li elements, but there's no containing ul or ol. Thankfully, even though this is a clear case of being badly behaved markup, most browsers are flexible enough to make it render correctly all the same.

Code Blocks

if(children[i].textContent.slice(0, 4).trim() === "") {
    children[i].innerHTML = '<div><pre>' + children[i].innerText + '</pre></div>';
}

Code blocks start with four characters of whitespace, and create a pre, which our parser/hack doesn't touch. For the most part, they should work quite well, though you have to have a new one for each line.

Quotes

if(children[i].textContent[0] == '>') {
    children[i].classList.add('quote');
} else {
    children[i].classList.remove('quote');
}

Quotes are bit more fiddly. This strange if/else is because when the user creates a new element in the content editable, it generally inherits the previous set of classes. Which means they would get stuck inside the quote block without it.

Horizontal rule

if(children[i].textContent.trim() == '---') {
    children[i].innerHTML = '<hr>';
}

Dead simple. Turning a triple dashed line into an actual line is a yawn.

Links

links = children[i].textContent.match(/[([wsd]+)](((?:/|https?://)[wd./?=#]+))/);
if(!!links && links.length > 2) {
    children[i].innerHTML = children[i].innerHTML
        .replace(links[0], '<a href="' + links[2] + '">' + links[1] + '</a>');
}

There is nothing simple about creating a link, unfortunately. This creates links from the Markdown-like [Some text](https://example.com) syntax. Its a gross regex which sort of works, most of the time. If there isn't a single bug in this, I'll eat my hat.

Bold Text

bold = children[i].textContent.match(/**(.+)**/);
if(!!bold && bold.length > 1) {
    children[i].innerHTML = children[i].innerHTML
        .replace(bold[0], "<strong>" + bold[1] + "</strong>");
}

More gross regex! And this time, there is a bug. It doesn't exactly work if you try and have more than one bold statement per paragraph.

Italic Text

italics = children[i].textContent.match(/*(.+)*/);
if(!!italics && italics.length > 1) {
    children[i].innerHTML = children[i].innerHTML
        .replace(italics[0], "<em>" + italics[1] + "</em>");
}

The same bug from bold text applies, because we're following the same pattern.

Underline Text

underscore = children[i].textContent.match(/_(.+)_/);
if(!!underscore && underscore.length > 1) {
    children[i].innerHTML = children[i].innerHTML
        .replace(underscore[0], "<u>" + underscore[1] + "</u>");
}

The same bug from bold text applies, because we're following the same pattern.

Strikethrough Text

strike = children[i].textContent.match(/~(.+)~/);
if(!!strike && strike.length > 1) {
    children[i].innerHTML = children[i].innerHTML
        .replace(strike[0], "<s>" + strike[1] + "</s>");
}

The same bug from bold text applies, because we're following the same pattern.

Remove Formatting

Now, the nature of content-editable makes it somewhat easy to add formatting, but it makes it a pain in the butt to remove it.

So we need to add a way to do that. Which means another loop, but much smaller:

var children = c.querySelectorAll('*:not(br)');
for(var i = 0; i < children.length; i++) {
    if(children[i].tagName != 'DIV') {
        // Strip formatting, plz.
        if(children[i].innerText.trim()[0] == '`') {
            var d = document.createElement('div');
            d.innerText = children[i].innerText.trim().slice(1);
            children[i].parentNode.replaceChild(d, children[i]);
        }
    }
}

Once the formatting is done, is when you construct the self-hosted link above.


The Result

The result of this hack is two fold.

  1. This single HTML file, which contains our code and is the editor. Think of it as a bootstrapping file. You only need access to this the once, to be able to create a document link.

  2. Truly evil links like this one, that contain the content and editor. You don't need access to the original file. The link is both content and editor, and can be shared (unsafely).

(If you are surprised clicking the link isn't doing anything - data URIs can be unsafe. Your browser is trying to protect you. You can copy/paste it into the URL bar. Or better yet, cut out the base64 content and decode it to make sure it's safe first!)


Comments

Submit comment...

Subscribe to this comment thread.