% vim:ts=1:et:nospell:spelllang=en_gb:ft=tex
\chapter{Implementation}
\label{implementation}
To implement the \emph{ppt2mxp} conversion tool that is the subject of this
thesis, we chose version 8 of the Java programming language \citep{gosling-1}.
Although the author has significant experience with other, arguably more
compelling, languages, several reasons pushed us towards Java, not the least
of them being its ease of use. Java \emph{is} easy to use --- it would not
have become as popular as it is today if it were not. It has a clear and
logical syntax, a consistent structure, and an extensive standard library.
The vast number of libraries available for Java was one of the more
important reasons for this choice. The existence of the Apache POI library
(see section \ref{poi}) was a huge help in reaching our goal; without it, we
would have had to reverse engineer the heavily obfuscated \ppt file format
ourselves, which would undoubtedly have taken up more time than was available
to us. Other libraries, such as Spring, which allows the programmer to use
and reuse components without writing complex systems to instantiate them,
further strengthened our resolve to make Java our primary technology choice.
However, Java is not the only technology used here. \mxp is written entirely
in HTML5, so any tool that works with \mxp sooner or later needs to use HTML5
as well. The widely accepted HTML5 standard makes \mxp presentations highly
portable and runnable on any device with a recent web browser, including
smartphones and tablets \citep{roels-1}.
\fig{conversion}{An original \ppt slide (top) and the converted result
including automated layout in \mxp (bottom)}
\figref{conversion} shows the result of our implementation. We can clearly
see the effects of our automated layout algorithm on the contents of this
slide. In the following sections we discuss how the various technologies were
used to create the \emph{ppt2mxp} tool and the \code{autolayout} container
plug-in, the combination of which accomplishes this result.
\section{Taking \ppt apart}
\label{poi}
When converting one file format into another, the first part of the process
involves getting the data you need out of the original file. This can be
very complicated, as some --- usually proprietary --- file formats are
deliberately designed to discourage this. They obfuscate data, encrypt it,
and structure it in illogical and unexpected ways, amongst other techniques.
The \ppt file format is unfortunately such a format, as Microsoft did not
want to risk other companies making software that works with \ppt files. Of
course, over the years people have managed to crack the format, enabling the
conversion of \ppt presentations into other formats, although such a
conversion usually does not guarantee results that mimic the original
perfectly. Luckily, we do not want a perfect conversion; we want a better
one.
We found the Apache POI library very helpful in this part of the
implementation. The POI\footnote{Originally ``Poor Obfuscation
Implementation'' \citep{sundaram-1}} library is a Java library that provides
an API to access Microsoft document formats. Its most mature (and most
popular) part is HSSF\footnote{``Horrible SpreadSheet Format''}, which is
used by Java developers worldwide to access Microsoft Excel spreadsheet data,
as well as to export data into Excel spreadsheets.
For our purposes, we relied on HSLF\footnote{``Horrible SLideshow Format''},
which provided us with a full API to access the contents of a \ppt
presentation in a myriad of ways. We could access all images at once, or
every bit of text in the whole presentation, but most interesting to us was
the ability to access contents on a per-slide basis. Getting a list of the
slides in a presentation first allowed us to group contents within their
immediate context, under one node per slide in our component tree. As such,
we could loop over the presentation's slides, converting them one by one by
placing the contents of each slide in an equivalent \mxp slide.
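The listing below is a minimal sketch of this per-slide loop, using the HSLF
API as it existed at the time (POI 3.x, the same \code{RichTextRun}-era
classes used in \lstref{bullets-1}). \code{MxpSlide}, \code{convertShape} and
\code{tree} are hypothetical names standing in for our own conversion
classes.
\begin{figure}[h!]
\begin{lstjava}{slides-1}{Looping over the slides of a presentation with HSLF}
// Classes from org.apache.poi.hslf (POI 3.x) and java.io.FileInputStream.
SlideShow ppt = new SlideShow(new FileInputStream("input.ppt"));
for (Slide slide : ppt.getSlides()) {
  MxpSlide target = new MxpSlide();    // one mxp slide per ppt slide
  for (Shape shape : slide.getShapes()) {
    target.add(convertShape(shape));   // convert each shape within its slide context
  }
  tree.addSlide(target);               // group contents under one node per slide
}
\end{lstjava}
\end{figure}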
\subsection{Bullets}
That was unfortunately not the end of it. While HSLF gives us access to all
the text in a presentation, or per slide, it was not immediately clear to us
how it distinguishes between `normal' text and bullet lists. This meant that,
for a long time, our conversion process was incomplete, as all bullets from
the original \ppt presentation appeared as incoherent text runs in our
converted result. We found the \code{RichTextRun} class, which has all the
tools and properties needed to detect bullets and their indentation level,
but we only discovered quite late that we could extract \code{RichTextRun}s
from the \code{TextShape}s we were getting out of the slides.
\begin{figure}[h!]
\begin{lstjava}{bullets-1}{Converting bullet points}
BulletList ul = new BulletList();
Stack<BulletList> listStack = new Stack<>();
int indent = 0;
for (RichTextRun run : textShape.getTextRun().getRichTextRuns()) {
  ListItem li = new ListItem();
  StringContent txt = new StringContent(StringUtils.strip(run.getText()));
  if (run.isBullet()) {
    if (run.getBulletOffset() > indent) {
      // Indentation increased: start a new list nested inside the current one
      indent = run.getBulletOffset();
      listStack.push(ul);
      ul = new BulletList();
      listStack.peek().addItem(ul);
    } else if (run.getBulletOffset() < indent) {
      // Indentation decreased: continue in the parent list
      indent = run.getBulletOffset();
      ul = listStack.pop();
    }
  } else {
    // Current run is not a bullet: unwind to the top bullet level
    indent = 0;
    while (listStack.size() > 0) {
      ul = listStack.pop();
    }
  }
  li.getContents().add(txt);
  ul.addItem(li);
}
// Unwind any remaining nesting so we return the top-level list
while (listStack.size() > 0) {
  ul = listStack.pop();
}
return ul;
\end{lstjava}
\end{figure}
The code in \lstref{bullets-1} shows how we extract bullet points from a
\code{TextShape} object and convert them into the nested bullet format we
need. As you may notice, the original bullet points are not nested in any
way: they are all on the same level of the original object tree, regardless
of their indentation level, which is stored on each bullet object instead.
However, we wanted a more elegant solution in which the indentation is
deduced from the level of nesting, much like HTML has always done.
The solution loops over the list of bullets, checking their indentation
level and building a stack of nested bullet lists accordingly. When the
indentation level increases, a new bullet list is started and added as an
item of the existing bullet list. When the indentation decreases, we go back
to the parent list and continue there. This can go down several levels, and
all the way back up again. Since the objects do not have a direct link to
their parent, we use a stack to keep track of the ancestry at all times.
\subsection{Animations}
Another challenge was dealing with animations and the other ways people
manage to put far more content on one slide than would be advisable. The
animations could not be transferred to \mxp, since \mxp has its own way of
transitioning from each component to the next in the form of a
ZUI\footnote{Zoomable User Interface}. It would technically be possible to
implement additional animations as a separate plug-in for \mxp to provide
equivalents of the animations in \ppt*, but that is beyond the scope of this
thesis. So we could not provide the same animations, yet some people use
those animations not just to show off but to actually show multiple pictures
and blocks of text, one after the other, on the same slide. Without
animations, this content would either not be visible or it would become a
serious layout issue in \mxp.
Our first solution was to limit the number of objects one slide can contain,
automatically moving any additional content onto extra slides. A downside of
this is that we had no way of guessing the correct order in which the content
should appear, so what may have been an intricate choreography of pictures in
\ppt might become an incoherent jumble of images in \mxp. Another solution
would be to scale all content down until it fits side by side on one slide,
and then rely on the ZUI to show the pictures one by one, but the same
problem with the order of appearance manifests itself in this case. In the
end, we decided it would be best to accept that no conversion algorithm is
going to be perfect, and that the author can always manually change the order
after the conversion is done.
With this in mind, we now render the components in the order we get them
from HSLF, hoping that this resembles the original order closely. The
automated layout takes care of any overlapping that might have occurred
originally, so we do not have to worry about that.
\section{Generating \mxp content}
Generating \mxp presentations was the final goal of the first phase of this
thesis. This seemed a fairly easy task at first, until we learned that the
\mxp compiler would not be available to us for most of the year. This meant
we would either not be able to view-test our generated presentations, or we
would have to convert them to browser-ready HTML5 ourselves. We chose the
latter option, as not being able to see our results would not have been very
helpful in implementing and tweaking our conversion tool. As a result, this
task became much more complicated, as we had to emulate the compiler's work
ourselves. Luckily, we had already decided to work with a Java object
representation of the original presentation as an intermediary form, a
so-called component tree, which meant we could easily change the output of
our conversion tool without affecting the rest of the conversion process, and
on a per-component basis.
\subsection{Emulating the \mxp compiler}
Since the \mxp compiler was not functional during most of this thesis'
implementation, we decided to generate an HTML5 file much like the \mxp
compiler would, including the \mxp JavaScript library and plug-ins. This
required us to first learn how \mxp works on the inside, which proved to be
a steep learning curve but gave us more insight into the software than we
would have gotten if we only had to generate \mxp XML and leave the rest to
the compiler.
\subsubsection{Improving the ZUI}
As an exercise, we changed the way the ZUI works. Originally, \mxp used the
CSS3 \code{transform: scale()} property to enlarge or reduce the whole view,
giving the impression of zooming in or out. This is an obvious approach,
simple in its execution and quite foolproof. However, the downside is that
you cannot zoom in very far: current browsers do not take advantage of vector
graphics and fonts even when you use them, and raster-based content obviously
does not scale well anyway. Instead, browsers render the content at its
initial scale, and then treat the result as one big image when it is scaled
or otherwise transformed afterwards. This means you get extremely pixelated
content when zooming in too far.
Through some refactoring, we were able to change this to use the
\code{transform: translateZ()} property instead, along with
\code{transform: perspective()}\footnote{Not to be confused with the
\code{perspective: <number>} property, which yields different results} and
\code{transform-style: preserve-3d}. This means we are now effectively
rendering the presentation in 3D, and moving our viewpoint around in the 3D
space to centre each slide or component in turn.
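To make the difference concrete, the listing below is a hypothetical sketch
of the transform strings our generator emits in each mode; the variable names
are illustrative, and the exact values depend on the slide being centred.
\begin{figure}[h!]
\begin{lstjava}{zui-1}{Scale-based versus translateZ-based zooming}
// Old approach: scale the whole view. Browsers rasterise the content
// first, so zooming in far produces pixelated results.
String scaleZoom = "transform: scale(" + zoomFactor + ");";

// New approach: keep the content at its natural scale and move the
// viewpoint along the Z-axis instead (with transform-style: preserve-3d
// set on the container). perspective() must precede translateZ().
String zZoom = "transform: perspective(" + perspective + "px)"
             + " translateZ(" + depth + "px);";
\end{lstjava}
\end{figure}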
We believe this opens the door to even more visually impressive
presentations, where content can be placed at different points along the
Z-axis. This allows, for example, placing multiple slides behind one another,
making for impressive zoom transitions between slides. The downside is that
the overview may not always show all content, as some content can overlap,
but we trust that authors will use this feature wisely when manually
adjusting the position of their slides. It may, for example, be useful to
group slides together in this way when there is too much content to show on
one slide but creating a second, separate slide would break the flow of the
presentation. In any case, our automated layout plug-in will not currently
generate slides positioned this way.
\subsubsection{Plain HTML5}
After investigating the inner workings of \mxp and studying some example
presentations, we were ready to start generating our own presentations
based on our component tree. This meant every possible component would
have to be written out as valid HTML, with the necessary attributes for
each generated tag and with any child components enclosed. Since our
component tree nodes are nested the way the final HTML should be nested,
this was not a problem.
Our implementation currently includes \code{compile()} methods on every
component object, which is consistent and easy to understand, but which might
not be the best way to implement this, depending on future goals. Instead of
having a separate layer take care of the output, output generation is
currently a cross-cutting concern, which is widely considered poor design. We
have to walk through the entire tree in any case, so performance will always
be linear in the number of components, regardless of where the output logic
lives.
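The following is a hypothetical sketch of this approach; the class names are
illustrative, not the exact ones used in our tool. Every component knows how
to render itself, and renders its children recursively.
\begin{figure}[h!]
\begin{lstjava}{compile-1}{Per-component compile() methods}
import java.util.ArrayList;
import java.util.List;

public interface Component {
  /** Render this node and all of its children as HTML5. */
  String compile();
}

class Slide implements Component {
  private final List<Component> children = new ArrayList<>();

  @Override
  public String compile() {
    StringBuilder html = new StringBuilder("<div class=\"slide\">");
    for (Component child : children) {
      html.append(child.compile()); // recurse down the component tree
    }
    return html.append("</div>").toString();
  }
}
\end{lstjava}
\end{figure}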
The current implementation has the advantage of extensibility: new
components can easily be added, and it is immediately obvious to any new
developer how these components should generate their equivalent HTML code.
However, to replace the output with a different format, it would be better if
this functionality were separated from the components and gathered in a
distinct \code{Writer} class instead. Switching formats would then be as easy
as dropping in a new \code{Writer} class that generates a different format.
Since we initially did not expect to switch outputs, and because we started
by implementing the conversion of one component and then added components as
we needed them, the current implementation --- focused on simplifying the
addition of new components --- was easier for us to work with. Refactoring
this implementation is perhaps an option for future work.
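As an illustration of this alternative, a \code{Writer} along these lines
might look as follows. This is a hypothetical sketch: \code{getChildren()} is
an assumed accessor on the component interface, and the per-type dispatch is
reduced to two cases for brevity.
\begin{figure}[h!]
\begin{lstjava}{writer-1}{Separating output generation into a Writer}
public interface Writer {
  /** Walk the component tree and serialise it to one target format. */
  String write(Component root);
}

class MxpXmlWriter implements Writer {
  @Override
  public String write(Component node) {
    String tag = (node instanceof Slide) ? "slide" : "component";
    StringBuilder out = new StringBuilder("<" + tag + ">");
    for (Component child : node.getChildren()) {
      out.append(write(child)); // recurse; one tag per component type
    }
    return out.append("</" + tag + ">").toString();
  }
}
\end{lstjava}
\end{figure}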
\subsubsection{\mxp XML}
Generating \mxp XML should be simpler than generating the HTML5 output,
although in the end there is probably not much difference. We would not have
to generate unique IDs for every component, nor the preamble content (which
includes the \mxp library itself), but the structure and content would remain
pretty much the same. Instead of \code{<div>} tags we could use \mxp-specific
tags, all of which adds up to a much more readable result.
The advantages for the user would be even bigger. Given a hypothetical \mxp
editor, the output of our conversion tool could be edited in such a tool
immediately, whereas such an editor would presumably not work on plain HTML5.
The reuse of content --- one of the main goals of \mxp --- would also be much
more feasible using the XML format.
\section{Creating layouts}
Implementing an automated layout is not an easy task. At every point in the
process, opportunities arise to use some kind of template, some design choice
that appears to make things easier but turns out to be restrictive when
applied to edge cases. There is also not always a clear distinction between
implementing a template and implementing an automated layout. If you decide
to put all of your components in a row next to each other, have you just
implemented a template or not? What if you make them all the same height, so
the row looks aesthetically pleasing as a whole? What if you do not?
The distinction becomes clearer when we have to work within a defined area,
such as a slide container. In this case, we have to fit our content within
the slide; this could be seen as a template decision, but it is not one our
algorithm makes, so the algorithm itself just tries to fulfil the constraints
it is given. We can then calculate the relative sizes of our components,
scale them down or up together (using the same scale factor) until their
combined size equals the area we need to fill, and puzzle them together in a
way that fits. If no arrangement is possible, we scale everything down some
more and try again, until we find something that works.
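Sketched in code, this fit-and-retry loop could look as follows;
\code{totalArea}, \code{applyScale}, \code{tryPack} and the shrink factor of
0.9 are illustrative names and values, not the exact ones used in our
implementation. Note that scaling linear dimensions by a factor $f$ scales
surface area by $f^2$, hence the square root.
\begin{figure}[h!]
\begin{lstjava}{fit-1}{Scaling content until it fits the container}
// Start from the scale at which the combined component area equals
// the area of the container.
double areaRatio = container.getArea() / totalArea(components);
applyScale(components, Math.sqrt(areaRatio));

// Try to puzzle the components into the container; if no arrangement
// fits, shrink everything a little and try again.
while (!tryPack(components, container)) {
  applyScale(components, 0.9);
}
\end{lstjava}
\end{figure}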
Whether using automated layout or not, the advice not to put too much
content on one slide stands. While an automated layout system may be able to
fit all content on one slide, too much content will still cause information
to be conveyed less effectively. Due to technical limitations of current
browsers, scaling content down and then zooming in to make it fill the screen
is not always an option: browsers treat the content as rasterized images
rather than vector graphics, and scaling them does not yield ideal results.
Scaling an image down and then zooming in on it will show you a blurred
version at best, and a great big coloured blob at worst. Using the
\code{perspective()} and \code{translateZ()} properties yields much better
results, but these are not easily usable within a slide container, as the
content would be rendered \emph{behind} the slide, making the slide seem
empty when seen from the front. If we want the content to be visible on the
slide, we need to render it within the size limits of the slide, which gives
the same blurred results as scaling.
The true power of automated layout generation becomes apparent when no such
boundaries are imposed. If the content does not need to be scaled down, we
can use it all at its original size, whether those sizes are similar or not.
Depending on which options the author of the presentation turns on, we can
then still resize content to match sizes, among other things.
\subsection{Using constraints}
The basis of our automated layout approach is the use of constraints. As a
first step in the process, we assign each component a certain `bounding
box', a set of constraints that dictates how close other components can get
to this particular component. This distance could be chosen arbitrarily or
hard-coded, but we decided to take a more dynamic approach and let it be
10\% of the component's width or height, whichever is larger. This way, the
padding around the content is equal on all sides, and proportionate to the
size of the content.
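A minimal sketch of this padding rule, assuming a hypothetical \code{Bounds}
value class and simple accessors on a component \code{c}:
\begin{figure}[h!]
\begin{lstjava}{bounds-1}{Deriving a component's bounding box}
// The margin is 10% of the largest dimension, so the padding is equal
// on all sides and proportional to the content.
double margin = 0.10 * Math.max(c.getWidth(), c.getHeight());
Bounds box = new Bounds(
    c.getX() - margin, c.getY() - margin,
    c.getWidth() + 2 * margin, c.getHeight() + 2 * margin);
\end{lstjava}
\end{figure}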
The next step in the process decides the size of the components. Two
mechanisms may be applied here, together, separately or not at all. One
mechanism is the equalization of sizes, where we resize all components in
such a manner that they all end up with the same or similar sizes. We can
decide to apply this to the width, the height or the surface area, depending
on the effect we hope to create. If we want to put all components side by
side, making them all the same height might make for an aesthetically
pleasing result, for example; if we want to group them together in a
raster-like manner, similar surface areas are usually a better idea.
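Equalizing by height, for example, could be sketched as follows; the
accessor and \code{resize} names are hypothetical. Every component is scaled
uniformly so that its height matches the average, preserving its aspect
ratio.
\begin{figure}[h!]
\begin{lstjava}{equalize-1}{Equalizing component heights}
double avgHeight = components.stream()
    .mapToDouble(Component::getHeight)
    .average().orElse(0);

for (Component c : components) {
  double f = avgHeight / c.getHeight();          // uniform scale factor
  c.resize(c.getWidth() * f, c.getHeight() * f); // keeps the aspect ratio
}
\end{lstjava}
\end{figure}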
The other mechanism is the scaling of components, in which we scale
everything up or down equally, maintaining relative proportions between
components while reducing or increasing the total used surface area. This
is especially useful when we have to generate an automated layout that
needs to fit within a predefined container with a fixed size, such as a
slide container.
The default approach, when no such fixed-size container is present, is to
apply neither of these mechanisms; we apply both mechanisms together when we
do have to fill a predefined area: we then calculate the average surface
area of all components, resize all components to match this average, and
finally scale everything up or down so that the total area fits the
container. It is possible to override this behaviour: when no fixed size is
defined, we can specify that all components should be sized equally, be it
in width, height or surface area. The latter is the default, but this too
can be overridden: specifying that the components should be placed in a row
or a column instead of a raster-like structure will automatically choose the
matching equalization method. Conversely, when there is a fixed size, we may
specify to retain the original proportions and only scale everything up or
down equally to fit the area.
\subsection{Alternative approaches}
During implementation, we tried several other approaches to improve upon
the automated layout, all of which we ultimately decided to abandon in
favour of the current approach. There were various reasons for abandoning
these paths. Often the reason was that we realized we were influencing the
automated layout process with human design decisions, which is exactly what
we were trying to avoid. Sometimes what we were trying did not yield the
results we hoped for; sometimes the implementation became too complex to
continue, or what seemed like a good idea in theory turned out to be
impossible in practice.
One such idea was to divide components over a raster-like structure, spread
over a set of rows according to a normal distribution --- most components in
the middle rows, a few components in the first and last few rows. We soon
realized that, on the one hand, this would become a kind of template (a very
dynamic one, but still), and on the other hand, it was a lot more complex
than we initially thought. While this feature has been abandoned in the
current implementation, we do think it may still be an option for future
work: although it is a template, it seems dynamic enough to still match the
spirit of our automated layout approach.
Another idea was to use the Z-axis to make components seem equal in size
when viewed from a certain angle as an overview, then zooming in on each
component to reveal its true size. This was abandoned purely for complexity
reasons: it is not enough to just place larger components further back; you
also need to adjust the x and y coordinates to make it seem like the
component is placed right next to another when in reality it is much further
back. This would not be necessary with an orthographic camera mode, but the
HTML5 rendering engines only have a perspective camera mode, which makes
this idea too complex to implement within our limited timeframe.
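To illustrate the correction involved: with a CSS perspective distance $p$,
a component translated by $z$ along the Z-axis (positive towards the viewer)
is projected with a scale factor $s$, so to make it \emph{appear} at screen
position $(x', y')$ relative to the perspective origin, its actual
coordinates must be divided by that same factor:
\[
  s = \frac{p}{p - z}, \qquad x = \frac{x'}{s}, \qquad y = \frac{y'}{s}.
\]
A component pushed back by the full perspective distance ($z = -p$) thus
appears at half size ($s = \tfrac{1}{2}$) and needs its coordinates doubled
to seem adjacent to its neighbours. Keeping these corrections consistent for
every component is what made the idea too complex for our timeframe.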