Dan Jutan – DH LAB
https://dhlab.lmc.gatech.edu
The Digital Humanities Lab at Georgia Tech

Data by Design: Automating the Chapter Timeline
https://dhlab.lmc.gatech.edu/databydesign/data-by-design-automating-the-chapter-timeline/
Tue, 02 Mar 2021 14:31:28 +0000

Each chapter of Data by Design has a chapter timeline, which places images and visualizations on a vertical minimap, allowing the user to get a sense of the chapter and their progress within it at a glance. Additionally, all user highlights will be visualized on the timeline as they’re made. The map will allow the user to click objects within it and be taken to the corresponding spot in the chapter. Embracing the notion of a “meta-visualization,” each chapter’s timeline is styled after the main visualization in that chapter.

The Peabody timeline at time of writing

 

Initially, the chapter timeline rendered manually-entered metadata, a process that was hard to maintain and required an additional source of truth: whenever the structure of a chapter changed, we’d have to manually update the timeline data. Instead, I built an automated system that looks at the chapter content as it is to build the metadata that is given to the timeline. In this post, I’d like to give an overview of how that works.

Compile-time or Runtime?

All of the chapter content is written and stored before it ever appears in your web browser. So should our chapter metadata system analyze the chapter content files or the rendered chapter content as it appears in your web browser? The former would happen at “compile-time”—during development, metadata would be generated by some sort of script, and the output of that script would be saved and uploaded along with the rest of the project to be referenced by the chapter timeline. The latter would involve an ad-hoc analysis of the chapter after it is rendered by the browser; metadata would be generated on the fly.

Both approaches have their advantages—the compile-time system would be more verifiable, more easily allow for manual tweaks, and would result in a slightly faster loading time for the user—but I went with a runtime approach. This is for two main reasons. First, I wanted a component’s exact vertical placement in the page to be part of this analysis, so that we could send the user to the right spot in the chapter when a square is clicked in the timeline. This exact rendering information is only available after all of the components have been rendered in the browser. Second, I wanted to be able to account for changes in the chapter structure that happened at runtime. For example, let’s say we had a chapter that dynamically loads an image only if the user clicks a certain button; I would want that new image to show up in the timeline then and only then.

Sections 

In the screenshot above, the start of a new section is represented by a big square, and the length of the line to follow corresponds with the “length” of the section, with spots reserved for paragraphs, images, and visualizations (“subsections”). So the first order of business when building this timeline is identifying the sections in the chapter. My first try at this was to analyze the DOM directly and find the header elements, but we ended up building a reusable Section component so that the style of all the section headers could be changed at once. Once that component was built out, it made more sense to have the Section component register itself rather than for the chapter timeline to go looking for it. But what was to receive and manage that registration?

The Source of Truth

We need a place to keep track of what we’ve found once we’ve analyzed the chapter. Our project uses Vuex, a Vue-integrated state management system, to manage the data needed by chapter visualizations, so I created a Vuex store to track the necessary information for the chapter timeline. When a Section component is mounted onto the page, it tells this store that it’s been created and passes in the id of the section element. Then, the store calls a function that analyzes the children of the section element to determine how much space to reserve for subsections. Let’s check out an example: the first section of the Peabody chapter. When it first gets registered, the store analyzes its DOM children, and finds four elements:

This is why you see four slots for squares on the timeline for this section. The next step is to take a look at these four elements and see what we know. One of the children, at index 3, has the special predefined “IMG” class, so we can go ahead and register an image at that subsection, which will show up as a green square in the timeline. We don’t know anything inherently about the div at index 1: is it a visualization, a scrollytell, or something that shouldn’t be put on the timeline? For elements like these, we again rely on the component to register itself with the store so that we can be confident about its identity. The div at index 1 in this example is the Map Scroller visualization. As I mentioned in my last post, visualizations all have a mounted hook that registers them when they get loaded into the page:

This actually happens before the section itself is registered, because in Vue, subcomponents’ mounted hooks are called before their parents. So by the time we do our section-child analysis, we’ve already been told to expect a visualization that looks like this element. So we’ll go ahead and save that information – a visualization square (orange) should be rendered at the second subsection slot of the first section.

The store allows us to have all the data required for the timeline in one, trackable place and decoupled from the timeline rendering itself. Thanks to Vuex’s debugging tools, which log every mutation to the store, it was easy to find bugs during development. And the store is decoupled in the sense that it doesn’t need to know any rendering details like the size or color of squares. It simply maintains the body of data needed for rendering, which is later passed to the Navline component for rendering.
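To make the registration order concrete, here is a minimal sketch of the idea as a plain JavaScript class rather than an actual Vuex module. The class name, method names, slot labels, and element ids are all assumptions for illustration and are not taken from the project's code.

```javascript
// Hypothetical stand-in for the timeline store; a real Vuex module would
// express the same steps as state, mutations, and actions.
class TimelineStore {
  constructor() {
    this.visualizationIds = new Set(); // element ids announced by visualizations
    this.sections = [];                // one entry per registered section
  }

  // Called from a visualization's mounted hook. Child hooks run before the
  // parent's, so this happens before the section registers itself.
  registerVisualization(elementId) {
    this.visualizationIds.add(elementId);
  }

  // Called from a Section's mounted hook with a description of its children.
  registerSection(sectionId, children) {
    const slots = children.map((child) => this.classify(child));
    this.sections.push({ id: sectionId, slots });
  }

  // Decide what, if anything, should be drawn in a child's slot.
  classify(child) {
    if (child.classes.includes("IMG")) return "image";
    if (this.visualizationIds.has(child.id)) return "visualization";
    return "empty"; // nothing known about this child, so no square is drawn
  }
}

// Usage with four made-up children, mirroring the shape of the example above.
const store = new TimelineStore();
store.registerVisualization("map-scroller");
store.registerSection("section-1", [
  { id: "", classes: [] },
  { id: "map-scroller", classes: [] },
  { id: "", classes: [] },
  { id: "", classes: ["IMG"] },
]);
```

Note that nothing in the sketch mentions colors or sizes: it only records what kind of thing sits in each slot, which is the decoupling described above.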

Paragraphs and Highlights

Paragraphs don’t themselves get a square on the timeline: instead, they’re represented by whitespace. However, when the user highlights something in that paragraph (and drags it into their notebook), an indicator shows up next to that paragraph slot:

Similarly, when the user highlights a caption of an image or text inside a visualization or scrollytell, a highlight indicator shows up beside that:

As you might guess, this follows the same registration pattern I’ve described: when a highlight is created, it registers itself with the store, which tracks down its parent section and figures out which subsection slot it should be in.

You might notice that the Peabody chapter has two paragraphs before the Map Scroller (at least at time of writing), but the timeline only shows one slot. In fact, a highlight indicator will show up in that slot regardless of whether you highlight in the first paragraph or the second. This was a deliberate design decision to combine adjacent paragraphs into one paragraph slot, keeping the timeline from getting too long.
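The merging rule can be sketched in a few lines. This is only an illustration under the assumption that each child has already been labeled with a kind such as "paragraph" or "visualization"; the function names and labels are hypothetical.

```javascript
// Collapse runs of adjacent paragraphs into a single paragraph slot.
function buildSlots(kinds) {
  const slots = [];
  for (const kind of kinds) {
    const previous = slots[slots.length - 1];
    if (kind === "paragraph" && previous === "paragraph") continue;
    slots.push(kind);
  }
  return slots;
}

// Find the slot that a given child ends up in, so a highlight made in either
// of two adjacent paragraphs is reported against the same slot.
function slotIndexFor(kinds, childIndex) {
  return buildSlots(kinds.slice(0, childIndex + 1)).length - 1;
}
```

With the kinds `["paragraph", "paragraph", "visualization"]`, both paragraphs map to slot 0 and the visualization to slot 1.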

Takeaways

There’s always a tradeoff when automating something: you want to make sure that the time and effort spent in the automation process really does save you time in the long run. In the case of this chapter timeline, this automation process drastically speeds up future chapter development: we only have to worry about following certain registration patterns—most of which are codified in reusable components and mixins—and the timeline will be functional right off the bat. It also makes maintenance far easier: updating the timeline is the same process as updating the chapter itself. Lastly, it guarantees that the timeline always works the same way in each chapter; if we made them by hand, we’d have to consciously make sure we were consistent.

All in all, this feature was incredibly satisfying to build. I love constructing these process-improving systems, and there are a few more to talk about in future posts, but for now—thanks for reading!

Data by Design: Code Reuse and Visualizations
https://dhlab.lmc.gatech.edu/process/data-by-design-code-reuse-and-visualizations/
Tue, 09 Feb 2021 02:08:06 +0000

When I joined, the team had done a considerable amount of research, design, and requirements exploration in addition to our small prototype. One of the ideas continually emphasized was that the book would be heavily data-driven. This can be a bit of a buzzword and a swiss-army-knife of a term – the first two D’s in the popular web visualization tool D3 are “data-driven,” for a technical example, while data-driven can characterize approaches in other disciplines, like data journalism.

Data by Design is situated in a broad meaning of the category – we’re data-driven in the sense that the scope of the book is driven by contours in the history of data, and the research follows data and the people that wield it. But in addition to researching data and using data for research, we make our argument by presenting data, rendering it visually throughout the book. These include recreations of historical data visualizations and new takes on them: new visualizations of old data and historical visualizations swapped out with new data. So we’re also data-driven in the technical sense: the book is filled with interactive web components that render and depend upon collections of data.

That characterizes a common task throughout development: in every chapter, we need to build components that take some dataset and present it visually. Software engineering is all about abstraction and generalization—if there’s a repeated task, find a way not to repeat yourself—and while this might not sound like a lot of repetition (after all, there’s a big difference between the visualizations of Peabody and Playfair), from the development side, the visualizations have much in common. They each do all or many of the following:

  • Require a data source to be loaded
  • Reformat that data in some way
  • Contain subcomponents that also need to see the data
  • Respond to events (think of these as messages) from subcomponents and send events back out to the chapter
  • Mutate the data (after user interaction)
  • Allow the user to drag it into the notebook
  • Save the mutated data to the notebook server when dragged into the notebook
  • Show up in the chapter timeline

This adds up to a sizeable chunk of functionality that can be shared across visualization components so we don’t have to reimplement these features in every new visualization.

In object-oriented programming, there are two common ways of sharing groups of functionality among objects: composition and inheritance. With composition, the objects each have a copy of an object that does the desired functionality; with inheritance, the objects declare themselves as being a version of the object that does the desired functionality, and thus “inheriting” its functionality, thanks to language features like subclassing. In this example, if we use composition, the visualization components would each create an instance of some helper object that has the features implemented, and it would call the helper’s methods when necessary.

A diagram of the Peabody Chart and Playfair Graph related to a VisualizationHelper class through composition

If we use inheritance, we’d declare each visualization component as being a Visualization, which would be thought of as a “parent” object, and then our visualization components would automatically contain all the functionality that was in the parent. (In many OOP languages these are called “superclasses,” or “abstract superclasses” when the parent is simply a template that depends on the child to flesh it out.)

A diagram of the Peabody Chart and Playfair Graph related to a Visualization class through inheritance
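In plain JavaScript classes, the two approaches might look like the sketch below. The class names follow the diagrams, but the `describe` method is a made-up stand-in for the shared functionality.

```javascript
// Composition: the chart owns a helper and delegates to it explicitly.
class VisualizationHelper {
  constructor(name) {
    this.name = name;
  }
  describe() {
    return `${this.name} is registered in the timeline`;
  }
}

class PeabodyChart {
  constructor() {
    this.helper = new VisualizationHelper("PeabodyChart");
  }
  describe() {
    return this.helper.describe(); // the chart has to forward the call itself
  }
}

// Inheritance: the graph declares itself a Visualization and gets the
// shared method without writing any forwarding code.
class Visualization {
  describe() {
    return `${this.constructor.name} is registered in the timeline`;
  }
}

class PlayfairGraph extends Visualization {}
```

Both `new PeabodyChart().describe()` and `new PlayfairGraph().describe()` work, but only the composed version had to spell out the delegation.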

Ever since the initial prototypes, Data by Design has used the JavaScript framework Vue.js. When I say “component,” I refer to the building blocks of a Vue application (somewhat analogous to “class” which is the building block of many OOP languages). I won’t go into all of the elements in a component, but they include:

  • data, which are reactive properties that can be referenced and updated throughout the component and its UI. For example, the interactive Peabody grid uses a data field to keep track of and respond to the current pixel that the user is hovering over.
  • props, which are like data, but they’re passed from the parent component to the child and can’t be changed by the child. That grid takes a “century” prop so it knows what year the first square of the grid represents.
  • methods, pieces of functionality that it can call and reuse. The Peabody grid calls one of its methods every time the user hovers their mouse over it.
  • lifecycle hooks, which set up the component to trigger pieces of functionality when something happens in the program. For example, a component might use the mounted lifecycle hook, which triggers when the component is added to the page, to register itself in the chapter timeline when it first becomes visible.
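Put together, a Vue 2 component using these four elements could look roughly like the options object below. This is a sketch, not the project's actual grid: the component name, the `hoveredPixel` field, the `onHover` method, the prop type, and the `timeline/registerVisualization` action name are assumptions. In a single-file component this object would be the default export.

```javascript
const PeabodyGrid = {
  name: "PeabodyGrid",

  // props: passed in by the parent and not changed by this component.
  props: {
    century: { type: Number, required: true }, // assumed to be a number
  },

  // data: reactive state owned by this component.
  data() {
    return { hoveredPixel: null };
  },

  // methods: reusable pieces of functionality.
  methods: {
    onHover(pixel) {
      this.hoveredPixel = pixel;
    },
  },

  // lifecycle hook: runs once the component has been added to the page.
  mounted() {
    // Hypothetical Vuex action name; $store.dispatch itself is standard Vuex.
    this.$store.dispatch("timeline/registerVisualization", this.$el.id);
  },
};
```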

The newest release, Vue 3, ships with a Composition API to make it as easy as possible to share code across components using composition. OOP design patterns tend to favor composition over inheritance: composition can be easier to maintain, it avoids the rigid, predetermined taxonomy required by inheritance, it always allows for multiple helper objects (inheritance frowns upon multiple parent objects), and it even allows for techniques that swap out the helper object for another at runtime.

The Composition API brings a whole suite of code-sharing features to Vue that I’ve enjoyed using in newer projects, but most current Vue projects, Data by Design included, are built on Vue 2. A primary way to share functionality among components in Vue 2 is its mixin feature, which is similar to inheritance. Instead of building a “parent” object, you build a mixin object. This is written like any other Vue component and can have all the features that a component can, but it can’t be used directly. Instead, a component can declare that it is using that mixin (or any number of mixins), in which case all the data, methods, and hooks in the mixin are merged with (or “mixed into”) the component.

This lets us do something that composition often can’t: completely move functionality out of the way of the child components. When our team builds a new component, I don’t want us to have to think about registering it in the timeline or making it draggable or figuring out how to send data to subcomponents: if it isn’t unique to the component, I want it to be done automatically. With composition, you’d still have to explicitly configure the appropriate hooks, even if that configuration is a single call to a helper object. This is preferred in many applications: with composition, you can involve functionality as needed, without having to include things that aren’t. But in this case, aside from prioritizing an easy developer experience, we do want to enforce all of the functionality. In other words, we want to make sure that every Visualization does certain things and has certain capabilities. It’s a case where inheritance is indeed desired over composition.

The mixin is imported and added to a component using the mixins option in Vue:

And then the visualization automatically gains all of the functionality in the mixin. Any props that are specified in the Visualization mixin are now accepted by the child component, any data and methods created in the mixin are accessible, and its lifecycle hooks are registered.

So for the Visualization mixin:

  • The props allow the component to take the name of a static dataset (which is to be grabbed from the server) and/or a mutable dataset (which is to later be sent to the server as part of the notebook). There’s also a width prop, which allows the component to base the size of visualizations on a maximum width that’s determined by the chapter.
  • The hooks make sure that when the component is created, the specified datasets are downloaded or registered, and when the component is mounted (i.e., when it appears in the page) it creates a draggable icon in the corner for dragging the visualization into the notebook and lets the chapter timeline know.
  • The data (really, computed properties) allow the child to view the dataset and various details about how it is registered.
  • The various helper methods allow the component to transform the data, easily register events, and create lengths based off of the passed-in width.
  • Additionally, many of the properties and methods are set up to be injected further down the component tree. That means that they aren’t just accessible by the component that’s been directly “mixed” with the mixin; any component contained by it can request them.
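As a rough illustration of the shape such a mixin can take in Vue 2, here is a heavily reduced sketch. It is not the real Visualization mixin: the prop names, the `fractionOfWidth` helper, the injected key, and the Vuex action name are all assumptions.

```javascript
const VisualizationMixin = {
  // Props accepted by every component that uses the mixin (names assumed).
  props: {
    staticDataset: String,
    mutableDataset: String,
    width: Number,
  },

  // Values offered to any descendant component that asks for them via inject.
  provide() {
    return { visualizationWidth: this.width };
  },

  methods: {
    // Helper for building lengths relative to the passed-in width.
    fractionOfWidth(fraction) {
      return this.width * fraction;
    },
  },

  // Merged with the component's own mounted hook, so registration happens
  // without the component author writing anything.
  mounted() {
    this.$store.dispatch("timeline/registerVisualization", this.$el.id);
  },
};

// A visualization opts in with the mixins option and nothing else.
const PlayfairGraph = {
  name: "PlayfairGraph",
  mixins: [VisualizationMixin],
};
```

One caveat with this sketch: a value returned from `provide` like this is not reactive in Vue 2, so a descendant that injects `visualizationWidth` would see the width as it was when the component was created.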

Many of the Visualization mixin’s roles rely on its communication with various Vuex state modules. In other words, the Visualization mixin doesn’t keep track of all the visualizations and all the data itself; its job is to coordinate with those systems. I’ll talk about how these modules work in a future post, as this one’s getting long enough!

You can view the full code here.

The Data by Design Notebook: Drag and Drop
https://dhlab.lmc.gatech.edu/databydesign/the-data-by-design-notebook-drag-and-drop/
Thu, 16 Jul 2020 02:39:40 +0000

In my last post, I introduced Data by Design’s notebook feature, and spoke a bit about the design and implementation of the chapter highlighting capability. Highlighting, though, is only the first step in this user story, and we wanted a drag-and-drop interface that felt simple and natural to move highlights to the notebook. In this post, I want to outline how I integrated drag-and-drop throughout DxD’s UI.

Why Drag and Drop?

Although we all intuitively felt that we wanted drag-and-drop for this interface, I want to parse that here. Why does drag and drop feel natural and appropriate? When does it not? How do we make the experience as intuitive as possible?

At a basic level, drag-and-drop is intuitive because it maximizes the correlation between interaction and representation. At each moment in the drag, the user is given the immediate feedback of what their completed action would mean. Drag-and-drop interfaces are everywhere, so let’s illustrate with an example.

The Chrome tab system allows the user to freely reposition tabs using drag-and-drop. As the user drags, the view is instantly updated to reflect the new tab position, whether or not the user ultimately chooses that position. This reduces cognitive load, as the user doesn’t have to imagine anything: What You See is What You Get. This complete visual correlation between interaction and result also allows the user to make use of the representation to help make their decision. They can drag a tab back and forth and the tab flow updates automatically, allowing the user to test out different positionings.

To use terms from Don Norman’s design vocabulary, great drag-and-drop controls maximize mapping: the interface creates an effective correlation between interaction and result, reducing the need to learn or remember anything new. It’s important, then, to make clear exactly what will happen when the user completes the drag, to give them feedback along the way: in the browser tab example, feedback takes the form of the surrounding tabs’ movement.

Drag-and-drop interfaces don’t rely on on-screen controls, like buttons, drop-downs or textboxes. This is their strength, as it allows drag-and-drop to be as intuitive as point-and-click and turn-to-steer. But it can also present a problem: how does the user know that something is draggable? The Chrome tabs change color on hover, but this signifier is rather abstract: it isn’t obvious that “brighter color” means “draggable,” and a user might not ever try. And so the first test for whether drag-and-drop is appropriate for a given interface is the question, “would I assume I might be able to drag it?”

It’s these things that I think about as I design and build the drag-and-drop notebook: creating an obvious mapping, giving appropriate feedback during the drag, and signifying that the drag is an option. It’s certainly a work-in-progress, as any initial, pre-user-research design must be, but it afforded my first foray into drag-and-drop on the web.

The Drag and Drop API

Before implementing anything, I had to learn the Web API for drag-and-drop. This essentially amounted to some articles on MDN, but it really isn’t the most intuitive API. I want to summarize the basic process in hopes that a step-by-step breakdown might be of use to someone, not the least to my future self!

Step 1: The draggable attribute

On your draggable element, you must set draggable="true". If you don’t, it won’t register your drag events. (By the way, you can set draggable="false" to turn off the browser’s default dragging functionality for that element, which normally allows the user to drag text, images, and links around and into other programs.)

Step 2: The dragstart event

When the user begins to drag the element, it will trigger any registered dragstart event. Within that event, you set the parameters of the drag, which include:

  • the drag image (the ghosted image that shows up to represent the element during the drag). You set this with event.dataTransfer.setDragImage(someElement, xOffset, yOffset).
  • the drag data, which is a series of labeled pieces of data. You attach these pieces of data to the event using event.dataTransfer.setData(key, data). This data then gets passed through to the event.dataTransfer object on the drop event—we’ll get to that.
    • In the examples on the docs, key is always a kind of data type, like “text/plain”. In practice, it’s simply a key for the data, and you can choose anything.
    • You can only send strings as that data. If you do need to send a full object, you can stringify it or send each property as its own setData call.
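A dragstart listener that follows these points might look like the sketch below. The `highlight` object, the `application/x-highlight` key, and the function name are hypothetical; `setData` and `setDragImage` are the standard DataTransfer methods.

```javascript
// Attach the drag data and, optionally, a custom drag image.
function onDragStart(event, highlight, dragImageElement) {
  // Only strings can be sent, so the object is stringified first.
  event.dataTransfer.setData("application/x-highlight", JSON.stringify(highlight));
  // A plain-text copy under a second key.
  event.dataTransfer.setData("text/plain", highlight.text);

  if (dragImageElement) {
    // Offsets position the image relative to the cursor.
    event.dataTransfer.setDragImage(dragImageElement, 0, 0);
  }
}

// In the browser this would be wired up along the lines of:
//   element.draggable = true;
//   element.addEventListener("dragstart", (e) => onDragStart(e, highlight, ghost));
```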

Step 3: The dragenter and dragover events

Yep, there are two of these that you need to keep straight. Registering dragenter on an element will trigger that event when a drag enters the bounds of the element, while dragover triggers as long as the user is dragging within that element, and it will trigger repeatedly during that time. (If you want to trigger an event as long as the user drags a draggable element anywhere, you can use the drag event on the draggable element. These latter two events sounded pretty useless to me, but perhaps they can come in handy if you want reactions on precise locations within your drop elements.)

Here’s where the docs get confusing. According to this, you need to register and cancel both the dragenter and dragover events to designate an element as a drop target; according to this, you only need dragover (along with the drop event). I’ve been able to make things work using just the dragover event, but I add and cancel dragenter just to make sure: maybe it’s a documentation mistake, or maybe it’s a platform-specific gotcha.

What do I mean by “cancel”? In order for the drop to register, you have to cancel the event by calling event.preventDefault() within the event listeners. In our case, I fulfill this by having a bunch of mostly-empty event listeners that just call event.preventDefault(), but you could use control statements to only call event.preventDefault() when a condition on the drag element or the drag data (both of which you can access through the listener’s event argument) is met.
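As a sketch of the conditional version, the listener below only cancels the event when the drag carries a particular key. The key name and function name are assumptions; the check reads `event.dataTransfer.types`, the list of keys set during dragstart.

```javascript
const HIGHLIGHT_TYPE = "application/x-highlight"; // hypothetical key

// Shared listener for dragenter and dragover: cancelling the event is what
// marks the element as a valid drop target for this particular drag.
function allowHighlightDrop(event) {
  const types = Array.from(event.dataTransfer.types);
  if (types.includes(HIGHLIGHT_TYPE)) {
    event.preventDefault();
  }
}

// In the browser:
//   target.addEventListener("dragenter", allowHighlightDrop);
//   target.addEventListener("dragover", allowHighlightDrop);
```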

You can also supposedly set the event’s drop effect within these listeners, which is supposed to give visual feedback by changing the cursor based on three available types of drag actions, but frankly I haven’t gotten it to work, and I find it easier to change the cursor by manipulating CSS. For now, I’m happy with the default:

Step 4: The drop event

They saved the most straightforward for last. Register a drop event listener on the target element and it will be called when the user has dropped the drag element on the target element. Within this listener you can access the data we set earlier by calling event.dataTransfer.getData(key). You can get an array of all the set keys with event.dataTransfer.types.

The docs tell you to call event.preventDefault() at the end of the event so it won’t call the browser’s default drop handler; I haven’t had an issue with this, but I can only assume it’s a good idea. I also call event.stopPropagation() on a parent so as not to trigger the drop event on its children.
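A drop listener along these lines could be sketched as follows, assuming the payload was stored as a JSON string during dragstart. The function name and the key passed in are hypothetical.

```javascript
// Read back a JSON payload that was attached during dragstart.
function onDrop(event, key) {
  event.preventDefault();  // skip the browser's default drop handling
  event.stopPropagation(); // keep the event from propagating to other elements

  const raw = event.dataTransfer.getData(key); // empty string if the key is absent
  return raw ? JSON.parse(raw) : null;
}

// In the browser:
//   notebook.addEventListener("drop", (e) => {
//     const highlight = onDrop(e, "application/x-highlight");
//     if (highlight) { /* add it to the notebook */ }
//   });
```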

There’s also a dragend event called on the dragging element, but I’d suggest keeping all of your drop handling in one place.

Signifiers and Feedback: It shouldn’t be a drag to drag

Terrible pun aside, let’s talk about the basic visual feedback in this initial design, and how I applied the API with design principles in mind.

To signify that a highlight is draggable, I’ve set the cursor to grab:

For highlights that span more than one paragraph, I dynamically generate a drag image that contains all of the dragging elements, rather than just the one the user clicked on, to accurately represent what data is being transferred.

 

As for visualizations, in this design they all have this little dragger in the corner, which serves both as an indicator and as a handle for the drag. (We don’t want to make visualizations draggable from anywhere, as many of them have internal mouse events.)

As you can see in this gif, the entire notebook is registered as a drop handler; by default, it puts what you dragged at the bottom of the notebook. But there are also individual drop handlers before and after each node in the notebook. As you drag over one, it changes color, indicating your target position:

As I’m writing this, I’m still not sure whether these indicators are tall enough, and it might be better not to necessitate dragging on to them at all, and have a “snap” feature when you’re close enough. It’s a continual process of tweaking, based on my own experience using the design and that of my teammates and, eventually, our target users. For me, it isn’t tedious: it’s part of what makes web development dynamic and enjoyable.

The Data by Design Notebook: Highlighting
https://dhlab.lmc.gatech.edu/databydesign/the-data-by-design-notebook-highlighting/
Sun, 07 Jun 2020 16:23:10 +0000

From the start, we knew that Data by Design was going to have a powerful dynamic notebook feature that would not only allow the reader to take notes on a chapter, but to add any part of that chapter into their notebook. Although it may sound like a self-contained feature, the notebook functionality involves multiple subsystems:

  • A highlight system (to allow the user to select text and save text, and to serialize the location of that highlight for later)
  • A drag-and-drop system (to allow the user to drag their highlights and other visualizations from the chapter to the notebook, and then to rearrange their notes)
  • A visualization state system (to keep track of the relationships between the visualizations and to manage the “copying” of the visualization when dragged into the notebook)
  • A user authentication system (to allow the user to create an account and to store and retrieve their notebook to and from the server)
  • The notebook user interface (to allow the user to add text notes and edit their notebook)

In this blog series, I’ll be describing a bit of the development process for each of those subsystems, starting in this post with the highlight system.

Note: The screen captures in this post use an early development version of both the chapter text and the notebook feature.

As mentioned, the goal of the highlight system is to allow the user to make visual highlights in the text. Those highlights can then be dragged into the notebook. Along with the content of the highlights, the precise location of the highlight is stored so that the user can click on the highlight in the notebook to go to its origin. Additionally, enough data must be stored about the highlight to repopulate it later, because we want to put all the highlights back in place when a user logs in.

This highlight feature was one of the most challenging and rewarding front-end features I’ve ever built. Let’s dive into some of the challenges that came up along the way!

Challenge #1: It isn’t enough to just change the color

It isn’t enough to just change the color of the highlighted text; we have to treat a highlight as its own object. Imagine the following scenario: a user drags from point A to point B to create a highlight to drag that text into the notebook. Later, they drag from point B to point C. The user would expect to be able to drag this new highlight independently, but if we only kept track of the visual indication of the highlights, we wouldn’t be able to recognize it as its own highlight. The chief purpose of this highlighting functionality is to enable the user to select text and then drag that text to add it to the notebook, and users often want to highlight two consecutive lines but comment on them separately.

While we must maintain the distinction between separate but juxtaposed highlights, we do want to allow the user to “edit” their highlight: to naturally expand a highlight by clicking and dragging outside an existing highlight to inside the highlight to select more text, or to subsume a highlight with a bigger one. In other words, we must differentiate between selections that represent a new highlight and those that represent a change to an old one.
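One way to picture that distinction is with character offsets. The sketch below is a simplification under assumed rules, not the logic the project uses: a selection that overlaps an existing highlight counts as an edit, and one that merely touches its edge counts as a new highlight. Ranges are half-open (`start` inclusive, `end` exclusive), and all names are hypothetical.

```javascript
// Decide whether a selection creates a new highlight or edits an existing one.
function classifySelection(selection, highlights) {
  const overlapping = highlights.find(
    (h) => selection.start < h.end && selection.end > h.start
  );

  if (!overlapping) {
    // Adjacent ranges (the A-B then B-C case) do not overlap, so they stay separate.
    return { action: "new", range: { start: selection.start, end: selection.end } };
  }

  // Expanding and subsuming both end up as the union of the two ranges.
  return {
    action: "edit",
    target: overlapping.id,
    range: {
      start: Math.min(selection.start, overlapping.start),
      end: Math.max(selection.end, overlapping.end),
    },
  };
}
```

With an existing highlight covering offsets 0 to 10, selecting 10 to 20 yields a new highlight, while selecting 5 to 15 edits the existing one to cover 0 to 15.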

Challenge #2: We dynamically create, remove, and consolidate elements

In web development, there is no easy way to change the styling and functionality of part of an element; generally, you create a new element that contains that part, and then manipulate that new element. In our case, when the user highlights part of a paragraph, we need to create a new span that contains what they highlight.

This can get tricky. What happens when the user selects from one paragraph into another? We can’t create a child element across two parents! Instead, we analyze the user’s selection, and if it bleeds into another block element, we create another span to place there. But since both spans belong to the highlight, and we want them to be dragged together, we must keep track of their relationship. Additionally, when the user starts in one paragraph, drags over another and into a third, we have to turn the entire middle paragraph into a span, and connect it to both the span in the previous element and the span that comes after.
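The paragraph-by-paragraph split can be sketched without touching the DOM. This illustration assumes paragraphs are identified by index and positions by character offset, which is simpler than real DOM ranges; the function and field names are made up.

```javascript
// Split one selection into a segment per paragraph it touches. Every segment
// carries the same highlightId so the pieces can be dragged together later.
function segmentsForSelection(paragraphLengths, start, end, highlightId) {
  const segments = [];
  for (let p = start.paragraph; p <= end.paragraph; p++) {
    // The first paragraph starts at the selection start, later ones at 0.
    const from = p === start.paragraph ? start.offset : 0;
    // The last paragraph stops at the selection end, earlier ones at their end.
    const to = p === end.paragraph ? end.offset : paragraphLengths[p];
    if (to > from) {
      segments.push({ highlightId, paragraph: p, from, to });
    }
  }
  return segments;
}
```

For paragraphs of length 100, 50, and 80, a selection from offset 90 of the first to offset 5 of the third produces three segments, with the middle one covering its whole paragraph.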

Another challenge: what if the user highlights text that starts in a non-styled part of the paragraph, but ends in a styled or interactive one, like a link? We must make sure that the new highlight span keeps the styles and attributes of the correct portion of the highlighted text.

Luckily, the browser provides the Range API, which helps extract the contents of a user’s highlight, even when it starts and stops in incongruous positions. By analyzing, manipulating, and reinjecting those elements, I was able to create a consistent solution.

Challenge #3: Serialization

When the user logs in, we want to populate the page with their previous highlights. To enable this, I had to find a way to represent a highlight as a string that we could save and load from the server. Once I built the procedure to turn a selection range into a highlight, I could use that to repopulate the highlights as long as I could save that selection range (a start position and an end position on the page) to the server. What this essentially amounts to is taking an element and generating a unique selector string that can be saved and used to find that element.

It turns out that generating concise selector strings is not a trivial problem, with dozens of solutions and libraries existing online. Since I joined the team, one of our goals for Data by Design has been to keep the number of dependencies low, and most of these libraries would be various degrees of overkill: while these selector-generation libraries create efficient general-use CSS selection strings compatible with built-in functions, I didn’t care about the format of the strings, since I was going to use my own deserialization function.

Ultimately, I opted for a simple solution that stores the path from the closest ancestor with an id (given that our element ids are unique), as well as the offset within that element. This solution might struggle if we expect the chapters to change after publication, but we can make the system more resilient to structural changes by adding ids to every low-level paragraph. (A system resilient to textual changes could save the actual HTML of the highlight and do a best fit when repopulating.)
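A minimal sketch of that idea follows. The string format (`id|child.path|offset`), the `TreeNodeLike` stand-in for a DOM node, and the function names are all assumptions made for illustration; the post does not specify the real format. The stand-in keeps the example runnable outside a browser, and the sketch assumes ids never contain the `|` character.

```typescript
// Minimal stand-in for a DOM node, so the sketch runs outside a browser.
interface TreeNodeLike {
  id?: string;
  parentNode: TreeNodeLike | null;
  childNodes: TreeNodeLike[];
}

// Walk up to the closest ancestor (or self) that has an id, recording the
// child index taken at each level, then join everything into one string.
function serializePosition(node: TreeNodeLike, offset: number): string {
  const path: number[] = [];
  let current: TreeNodeLike = node;
  while (!current.id) {
    const parent = current.parentNode;
    if (parent === null) {
      throw new Error("no ancestor with an id");
    }
    path.unshift(parent.childNodes.indexOf(current));
    current = parent;
  }
  return [current.id, path.join("."), String(offset)].join("|");
}

// Reverse the process: find the element by id, then follow the child indices.
// `findById` is a hypothetical lookup, e.g. a wrapper around getElementById.
function resolvePosition(
  serialized: string,
  findById: (id: string) => TreeNodeLike | undefined,
): { node: TreeNodeLike; offset: number } {
  const [id = "", path = "", offset = "0"] = serialized.split("|");
  const root = findById(id);
  if (root === undefined) {
    throw new Error("no element with id " + id);
  }
  let node: TreeNodeLike = root;
  for (const step of path === "" ? [] : path.split(".")) {
    const child = node.childNodes[Number(step)];
    if (child === undefined) {
      throw new Error("saved path no longer matches the page");
    }
    node = child;
  }
  return { node, offset: Number(offset) };
}
```

A highlight would then be two such strings, one for each end of the selection range. In a real browser, `childNodes` is a `NodeList` rather than an array, so it would need converting (for example with `Array.from`) before calling `indexOf`.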

Challenge #4: Grouping and Dragging

As mentioned earlier, multiple highlight span elements may make up a given highlight. That means that the user should be able to start dragging from any of them and expect that the whole group will come along. We thus need to store the relationship between the elements. (Recall the AB, BC example: we can’t simply check the DOM to see if there’s a highlight span directly after the dragged one, as that span might be part of a different highlight sequence.)

One way to do this is to make the Highlightable mixin (the object that handles all of this highlight creation logic) stateful: storing a reference to each highlight span, keeping track of their relationships, and providing lookup functionality to grab a highlight span’s buddies. I felt that the simpler solution was to put a little bit of state on the span elements themselves; namely, to add “overflow-next” and “overflow-prev” classes to signify when an element comes with a friend. Then, no matter the starting point, the drag event handler traverses and collects the elements. More on the drag-and-drop sequence in the next blog post!
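The traversal can be sketched roughly as follows. This is an illustration, not the real drag handler: it assumes all highlight spans are available as a list in document order, that “overflow-next” means the highlight continues in the next span, and that “overflow-prev” means it continues from the previous one. `SpanLike`, `hasClass`, and `collectHighlightGroup` are hypothetical names.

```typescript
// Stand-in for a highlight span element: a label plus its CSS classes.
interface SpanLike {
  label: string;
  classes: string[];
}

function hasClass(span: SpanLike | undefined, className: string): boolean {
  return span !== undefined && span.classes.indexOf(className) !== -1;
}

// Starting from any span in a highlight, walk backward while spans are
// marked "overflow-prev" and forward while they are marked "overflow-next",
// then return the whole group in document order.
function collectHighlightGroup(spansInOrder: SpanLike[], startIndex: number): SpanLike[] {
  let first = startIndex;
  while (first > 0 && hasClass(spansInOrder[first], "overflow-prev")) {
    first--;
  }
  let last = startIndex;
  while (last < spansInOrder.length - 1 && hasClass(spansInOrder[last], "overflow-next")) {
    last++;
  }
  return spansInOrder.slice(first, last + 1);
}
```

Because the walk follows the classes rather than mere adjacency, a span that sits directly after the dragged one but belongs to a different highlight is left out of the group.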

Challenge #5: Non-highlightable elements

I saved the hardest for last. The multimedia chapters of Data by Design have more than just text; embedded images and interactive components can be dragged into the notebook, but it doesn’t make sense to make them highlightable in the same way that text is. Given that, what should the highlighting experience be like when the user encounters these elements? To maintain a non-obtrusive experience, I allow the user to select over these elements, but when they let go of the drag, any element that is not highlightable will not be highlighted.

This involves an array of allowed “highlightable” block element types (which can be configured by each chapter), as well as functionality to examine an element’s closest block parent. One of my first attempts involved removing the entire selection from the DOM and returning each of its block children to the DOM as either unchanged or as a highlight span element. However, my best solution involved figuring out the highlightable portions and only extracting and modifying those.

After the user finishes selecting, we break down the range into an array of smaller, highlightable subranges. That array is iterated, creating highlightable elements; the non-highlightable components aren’t added to this array and are left untouched.
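As a rough sketch of that filtering step, assume each piece of the selection is represented by the element that contains it. The `ElementLike` stand-in, the `BLOCK_TAGS` list, and the function names below are hypothetical and only illustrate the idea of checking each piece's closest block parent against a per-chapter allow list.

```typescript
// Stand-in for a DOM element: a tag name and a link to its parent.
interface ElementLike {
  tag: string;
  parent: ElementLike | null;
}

// One piece of the user's selection and the element it sits in.
interface SelectionPiece {
  element: ElementLike;
  text: string;
}

// Hypothetical list of tags treated as block-level for this sketch.
const BLOCK_TAGS: string[] = ["p", "li", "blockquote", "figure", "div"];

// Walk up from an element until a block-level element is found.
function closestBlock(el: ElementLike): ElementLike | null {
  let current: ElementLike | null = el;
  while (current !== null && BLOCK_TAGS.indexOf(current.tag) === -1) {
    current = current.parent;
  }
  return current;
}

// Keep only the pieces whose closest block parent is an allowed type;
// everything else is skipped and therefore left untouched.
function highlightablePieces(pieces: SelectionPiece[], allowedBlocks: string[]): SelectionPiece[] {
  return pieces.filter((piece) => {
    const block = closestBlock(piece.element);
    return block !== null && allowedBlocks.indexOf(block.tag) !== -1;
  });
}
```

With an allow list of `["p", "li"]`, text inside a paragraph (even when nested in an inline element) is kept, while an image inside a figure is dropped from the result.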


That’s it for the highlight feature! While it’s tailor-made for Data by Design, it provides a feature that I think a lot of digital humanities projects could benefit from. I wrote (and rewrote and rewrote) more code for this feature than for any other part of this project, and I learned a lot about the ins and outs of the DOM and some of its less well-known APIs. As an undergrad, it felt amazing to be entrusted with a problem as complex as this one, and I’m proud of what I came up with.

]]>
https://dhlab.lmc.gatech.edu/databydesign/the-data-by-design-notebook-highlighting/feed/ 1 1062