stud

study spacejunk
Log | Files | Refs | LICENSE

blob db669a72 (21342B) - Raw


      1 \documentclass[a4paper]{article}
      2 
      3 \usepackage[T1]{fontenc}
      4 \usepackage[english]{babel}
      5 \usepackage[utf8]{inputenc}
      6 %\usepackage{a4wide}
      7 \usepackage [autostyle, english=american]{csquotes}
      8 \MakeOuterQuote{"}
      9 \usepackage[maxbibnames=99,style=numeric,sorting=none]{biblatex}
     10 \addbibresource{bib.bib}
     11 \usepackage[pdfusetitle]{hyperref}
     12 \usepackage{enumitem}
     13 \usepackage[toc,page,title]{appendix}
     14 \usepackage{caption}
     15 \usepackage{subcaption}
     16 \usepackage{gensymb}
     17 \usepackage{units}
     18 \usepackage{varwidth}
     19 \usepackage{tabularx}
     20 \usepackage{float}
     21 \usepackage{tikz}
     22 \usepackage{minted}
     23 \usepackage{fancyvrb}
     24 \input{version.inc}
     25 \input{vars.inc}
     26 
     27 % for layout debugging
     28 \usepackage{layouts}
     29 
     30 \newcommand{\onpage}[1]{\ref{#1} on page~\pageref{#1}}
     31 \newcommand{\titlecite}[1]{\citetitle{#1} \cite{#1}}
     32 \newcommand{\DP}{Douglas \& Peucker}
     33 \newcommand{\VW}{Visvalingam--Whyatt}
     34 \newcommand{\WM}{Wang--M{\"u}ller}
     35 \newcommand{\MYTITLE}{Cartographic Generalization of Lines using free software (example of rivers)}
     36 \newcommand{\MYAUTHOR}{Motiejus Jakštys}
     37 
     38 \title{\MYTITLE}
     39 \author{\MYAUTHOR}
     40 \date{\VCDescribe}
     41 
     42 \begin{document}
     43 
     44 \begin{titlepage}
     45     \begin{center}
     46         \includegraphics[width=0.4\textwidth]{vu}
     47 
     48         \huge
     49         \textbf{\MYTITLE} \\[4ex]
     50 
     51         \LARGE
     52         \textbf{\MYAUTHOR} \\[8ex]
     53 
     54         \vfill
     55 
     56         A thesis presented for the degree of\\
     57         Master in Cartography \\[3ex]
     58 
     59         \large
     60         \VCDescribe
     61     \end{center}
     62 \end{titlepage}
     63 
     64 \begin{abstract}
     65 \label{sec:abstract}
     66 Current open-source line generalization solutions have their roots in
     67     mathematics and geometry, and are not fit for natural objects like rivers
     68     and coastlines. This paper discusses our implementation of {\WM} algorithm
     69     under and open-source license, explains things that we would had
     70     appreciated in the original paper and compares our results to different
     71     generalization algorithms.
     72 \end{abstract}
     73 
     74 \newpage
     75 
     76 \tableofcontents
     77 \listoffigures
     78 
     79 \newpage
     80 
     81 \section{Introduction}
     82 \label{sec:introduction}
     83 
     84 \iffalse
     85 NOTICE: this value should be copied to layer2img.py:TEXTWIDTH, so dimensions
     86 of inline images are reasonable.
     87 
     88 textwidth in cm: {\printinunitsof{cm}\prntlen{\textwidth}}
     89 \fi
     90 
     91 When creating small-scale maps, often the detail of the data source is greater
     92 than desired for the map. This becomes especially acute for natural features
     93 that have many bends, like coastlines, rivers and forest boundaries.
     94 
     95 To create a small-scale map from a large-scale data source, these features need
     96 to be generalized: detail should be reduced. However, while doing so, it is
     97 important to preserve the "defining" shape of the original feature, otherwise
     98 the result will look unrealistic.
     99 
    100 For example, if a river is nearly straight, it should be nearly straight after
    101 generalization, otherwise a too straightened river will look like a canal.
    102 Conversely, if the river is highly wiggly, the number of bends should be
    103 reduced, but not removed.
    104 
    105 Generalization problem for other objects can often be solved by other
    106 non-geometric means:
    107 
    108 \begin{itemize}
    109     \item Towns and cities can be filtered and generalized by number of
    110         inhabitants.
    111     \item Roads can be eliminated by the road length, number of lanes, or
    112         classification of the road (local, regional, international).
    113 \end{itemize}
    114 
    115 Natural line generalization problem can be viewed as having two competing
    116 goals:
    117 
    118 \begin{itemize}
    119     \item Reduce detail by removing or simplifying "less important" features.
    120     \item Retain enough detail, so the original is still recognize-able.
    121 \end{itemize}
    122 
    123 Given the discussed complexities, a fine line between under-generalization
    124 (leaving object as-is) and over-generalization (making a straight line) must be
    125 found. Therein lies the complexity of generalization algorithms: all have
    126 different trade-offs.
    127 
    128 \section{Literature review and problematic}
    129 \label{sec:literature-review}
    130 
    131 A number of cartographic line generalization algorithms have been researched.
    132 The "classical" ones are {\DP} and {\VW}.
    133 
    134 \subsection{Available algorithms}
    135 
    136 \subsubsection{{\DP}, {\VW} and Chaikin's}
    137 
    138 {\DP} \cite{douglas1973algorithms} and {\VW} \cite{visvalingam1993line} are
    139 "classical" line generalization computer graphics algorithms. They are
    140 relatively simple to implement, require few runtime resources. Both of them
    141 accept only a single parameter, based on desired scale of the map, which makes
    142 them very simple to adjust for different scales.
    143 
    144 Both algorithms are part of PostGIS, a free-software GIS suite:
    145 \begin{itemize}
    146     \item {\DP} via
    147         \href{https://postgis.net/docs/ST_Simplify.html}{PostGIS Simplify}.
    148 
    149     \item {\VW} via
    150         \href{https://postgis.net/docs/ST_SimplifyVW.html}{PostGIS SimplifyVW}.
    151 \end{itemize}
    152 
    153 Examples of <TBD Chaikin and others>
    154 
    155 It may be worthwhile to post-process those through a widely available Chaikin's
    156 line smoothing algorithm \cite{chaikin1974algorithm} via
    157 \href{https://postgis.net/docs/ST_ChaikinSmoothing.html}{PostGIS
    158 ChaikinSmoothing}.
    159 
    160 
    161 \subsubsection{Modern approaches}
    162 
    163 Due to their simplicity and ubiquity, {\DP} and {\VW} have been established as
    164 go-to algorithms for line generalization. During recent years, alternatives
    165 have emerged. These modern replacements fall into roughly two categories:
    166 
    167 \begin{itemize}
    168 
    169     \item Cartographic knowledge was encoded to an algorithm (bottom-up
    170         approach). One among these are \titlecite{wang1998line}, also known
    171         as {\WM}'s algorithm.
    172 
    173     \item Mathematical shape transformation which yields a more cartographic
    174         result. E.g. \titlecite{jiang2003line},
    175         \titlecite{dyken2009simultaneous}, \titlecite{mustafa2006dynamic},
    176         \titlecite{nollenburg2008morphing}.
    177 
    178 \end{itemize}
    179 
    180 Authors of most of the aforementioned articles have implemented the
    181 generalization algorithm, at least to generate the visuals in the articles.
    182 However, I wasn't able to find code for any of those to evaluate with my
    183 desired data set, or use as a basis for my own maps. {\WM} \cite{wang1998line}
    184 is available in a commercial product.
    185 
    186 Lack of robust openly available generalization algorithm implementations poses
    187 a problem for map creation with free software: there is not a similar
    188 high-quality simplification algorithm to create down-scaled maps, so any
    189 cartographic work, which uses line generalization as part of its processing,
    190 will be of sub-par quality. We believe that availability of high-quality
    191 open-source tools is an important foundation for future cartographic
    192 experimentation and development, thus it it benefits the cartographic society
    193 as a whole.
    194 
    195 \subsection{Problematic with generalization of rivers}
    196 
    197 \section{Methodology}
    198 \label{sec:methodology}
    199 
    200 The original {\WM}'s algorithm \cite{wang1998line} leaves something to be
    201 desired for a practical implementation: it is not straightforward to implement
    202 the algorithm from the paper alone.
    203 
    204 Explanations in this document are meant to expand, rather than substitute, the
    205 original description in {\WM}. Therefore familiarity with the original paper is
    206 assumed, and, for some sections, having it close-by is necessary to
    207 meaningfully follow this document.
    208 
    209 In this paper we describe {\WM} in a detail that is more useful for algorithm:
    210 each section will be expanded, with more elaborate and exact illustrations for
    211 every step of the algorithm.
    212 
    213 Algorithms discussed in this paper assume Euclidean geometry.
    214 
    215 \subsection{Vocabulary and terminology}
    216 
    217 This section defines vocabulary and terms as defined in the rest of the paper.
    218 
    219 \begin{description}
    220 
    221     \item[Vertex] is a point on a plane, can be expressed by a pair of $(x,y)$
    222         coordinates.
    223 
    224     \item[Line Segment (or Segment)] joins two vertices by a straight line. A
    225         segment can be expressed by two coordinate pairs: $(x_1, y_1)$ and
    226         $(x_2, y_2)$. Line Segment and Segment are used interchangeably
    227         throughout the paper.
    228 
    229     \item[Line] represents a single linear feature in the real world. For
    230         example, a river or a coastline. {\tt LINESTRING} in GIS terms.
    231 
    232         Geometrically, A line is a series of connected line segments, or,
    233         equivalently, a series of connected vertices. Each vertex connects to
    234         two other vertices, except those vertices at either ends of the line:
    235         these two connect to a single other vertex.
    236 
    237     \item[Bend] is a subset of a line that humans perceive as a curve. The
    238         geometric definition is complex and is discussed in
    239         section~\onpage{sec:definition-of-a-bend}.
    240 
    241     \item[Baseline] is a line between bend's first and last vertex.
    242 
    243     \item[Sum of inner angles] TBD.
    244 
    245 \end{description}
    246 
    247 \subsection{Radians and Degrees}
    248 
    249 This document contains a few constant angles expressed in radians.
    250 Table~\ref{table:radians} summarizes some of the values used in this document
    251 and the implementation.
    252 
    253 \begin{table}[h]
    254     \centering
    255     \begin{tabular}{|c|c|c|c|c|c|c|}
    256         \hline
    257         Degrees & $30^\circ$          & $45^\circ$          & $90^\circ$          & $180^\circ$ & $360^\circ$ \\
    258         \hline
    259         Radians & $\nicefrac{\pi}{6}$ & $\nicefrac{\pi}{4}$ & $\nicefrac{\pi}{2}$ & $\pi$       & $2\pi$ \\
    260         \hline
    261     \end{tabular}
    262     \caption{Popular degree and radian values}
    263     \label{table:radians}
    264 \end{table}
    265 
    266 \subsection{Automated tests}
    267 \label{sec:automated-tests}
    268 
    269 As part of the algorithm realization, an automated test suite has been
    270 developed. Shapes to test each function have been hand-crafted and expected
    271 results have been manually calculated. The test suite executes parts of the
    272 algorithm against a predefined set of geometries, and asserts that the output
    273 matches the resulting hand-calculated geometry.
    274 
    275 The full set of test geometries is visualized in
    276 figure~\onpage{fig:test-figures}. The figure includes arrows depicting line
    277 direction.
    278 
    279 \begin{figure}[h]
    280     \centering
    281     \includegraphics[width=\textwidth]{test-figures}
    282     \caption{Line geometries for automated test cases}
    283     \label{fig:test-figures}
    284 \end{figure}
    285 
    286 The full test suite can be executed with a single command, and completes in a
    287 few seconds. Having an easily accessible test suite boosts confidence that no
    288 unexpected bugs have snug in while modifying the algorithm.
    289 
    290 \section{Description of the implementation}
    291 
    292 Like alluded in section~\onpage{sec:introduction}, {\WM} paper skims over
    293 certain details, which are important to implement the algorithm. This section
    294 goes through each algorithm stage, illustrating the intermediate steps and
    295 explaining the author's desiderata for a more detailed description.
    296 
    297 Illustrations of the following sections are extracted from the automated test
    298 cases, which were written during the algorithm implementation (as discussed in
    299 section~\onpage{sec:automated-tests}).
    300 
    301 Lines in illustrations are black, and bends are heavily colored after
    302 converting them to polygons. Bends are converted to polygons (for illustration
    303 purposes) using the following algorithm:
    304 
    305 \begin{itemize}
    306     \item Join the first and last vertices of the bend, creating a polygon.
    307     \item Color the polygons using distinct colors.
    308 \end{itemize}
    309 
    310 \subsection{Definition of a Bend}
    311 \label{sec:definition-of-a-bend}
    312 
    313 The original article describes a bend as:
    314 
    315 \begin{displaycquote}{wang1998line}
    316     A bend can be defined as that part of a line which contains a number of
    317     subsequent vertices, with the inflection angles on all vertices included in
    318     the bend being either positive or negative and the inflection of the bend's
    319     two end vertices being in opposite signs.
    320 \end{displaycquote}
    321 
    322 While it gives a good intuitive understanding of what the bend is, this section
    323 provides more technical details. Here are some non-obvious characteristics that
    324 are necessary when writing code to detect the bends:
    325 
    326 \begin{itemize}
    327     \item End segments of each line should also belong to bends. That way, all
    328         segments belong to 1 or 2 bends.
    329 
    330     \item First and last segments of each bend (except for the two end-line
    331         segments) is also the first vertex of the next bend.
    332 \end{itemize}
    333 
    334 Properties above may be apparent when looking at illustrations at this article
    335 or reading here, but they are nowhere as such when looking at the original
    336 article.
    337 
    338 Figure~\ref{fig:fig8-definition-of-a-bend} illustrates article's Figure 8,
    339 but with bends colored as polygons: each color is a distinctive bend.
    340 
    341 \begin{figure}[h]
    342     \centering
    343     \includegraphics[width=\textwidth]{fig8-definition-of-a-bend}
    344     \caption{Originally Figure 8: detected bends are highlighted}
    345     \label{fig:fig8-definition-of-a-bend}
    346 \end{figure}
    347 
    348 \subsection{Gentle Inflection at End of a Bend}
    349 
    350 The gist of the section is in the original article:
    351 
    352 \begin{displaycquote}{wang1998line}
    353     But if the inflection that marks the end of a bend is quite small, people
    354     would not recognize this as the bend point of a bend
    355 \end{displaycquote}
    356 
    357 Figure~\ref{fig:fig5-gentle-inflection} visualizes original paper's Figure 5,
    358 when a single vertex is moved outwards the end of the bend.
    359 
    360 \begin{figure}[h]
    361     \centering
    362     \begin{subfigure}[b]{.45\textwidth}
    363         \includegraphics[width=\textwidth]{fig5-gentle-inflection-before}
    364         \caption{Before applying the inflection rule}
    365     \end{subfigure}
    366     \hfill
    367     \begin{subfigure}[b]{.45\textwidth}
    368         \includegraphics[width=\textwidth]{fig5-gentle-inflection-after}
    369         \caption{After applying the inflection rule}
    370     \end{subfigure}
    371     \caption{Originally Figure 5: gentle inflections at the ends of the bend}
    372     \label{fig:fig5-gentle-inflection}
    373 \end{figure}
    374 
    375 The illustration for this section was clear, but insufficient: it does not
    376 specify how many vertices should be included when calculating the end-of-bend
    377 inflection. We chose the iterative approach --- as long as the angle is "right"
    378 and the distance is decreasing, the algorithm should keep re-assigning vertices
    379 to different bends; practically not having an upper bound on the number of
    380 iterations.
    381 
    382 To prove that the algorithm implementation is correct for multiple vertices,
    383 additional example was created, and illustrated in
    384 figure~\ref{fig:inflection-1-gentle-inflection}: the rule re-assigns two
    385 vertices to the next bend instead of one.
    386 
    387 \begin{figure}[h]
    388     \centering
    389     \begin{subfigure}[b]{.45\textwidth}
    390         \includegraphics[width=\textwidth]{inflection-1-gentle-inflection-before}
    391         \caption{Before applying the inflection rule}
    392     \end{subfigure}
    393     \hfill
    394     \begin{subfigure}[b]{.45\textwidth}
    395         \includegraphics[width=\textwidth]{inflection-1-gentle-inflection-after}
    396         \caption{After applying the inflection rule}
    397     \end{subfigure}
    398     \caption{Gentle inflection at the end of the bend when multiple vertices is moved}
    399     \label{fig:inflection-1-gentle-inflection}
    400 \end{figure}
    401 
    402 To find and fix the gentle bends' inflections requires to run the algorithm in
    403 both directions; if implemented as documented, the steps will fail to match
    404 some bends that should be mutated. This implementation does it in the following
    405 way:
    406 
    407 \begin{enumerate}
    408     \item Run the algorithm from beginning to the end.
    409     \item \label{rev1} Reverse the line and each bend.
    410     \item Run the algorithm again.
    411     \item \label{rev2} Reverse the line and each bend.
    412     \item Return result.
    413 \end{enumerate}
    414 
    415 The current implementation is the most straightforward, but not optimal:
    416 reversing of lines and bends could be avoided by walking backwards the lines.
    417 In this case, steps \ref{rev1} and \ref{rev2} could be spared, thus saving
    418 memory and computation time.
    419 
    420 The "quite small angle" was arbitrarily chosen to $\smallAngle$.
    421 
    422 \subsection{Self-line Crossing When Cutting a Bend}
    423 
    424 When bend's baseline crosses another bend, it is called self-crossing. This is
    425 undesirable in the upcoming operators, and self-crossings should be removed
    426 following the rules of the article.
    427 
    428 \begin{figure}[h]
    429     \centering
    430     \begin{subfigure}[b]{.45\textwidth}
    431         \includegraphics[width=\textwidth]{fig6-selfcrossing-before}
    432         \caption{Bend's baseline (dotted) is crossing a neighboring bend}
    433     \end{subfigure}
    434     \hfill
    435     \begin{subfigure}[b]{.45\textwidth}
    436         \includegraphics[width=\textwidth]{fig6-selfcrossing-after}
    437         \caption{Self-crossing removed following the algorithm}
    438     \end{subfigure}
    439     \caption{Originally Figure 6: simple case of self-line crossing}
    440     \label{fig:fig6-selfcrossing}
    441 \end{figure}
    442 
    443 The self-line-crossing may happen not by the neighboring bend, but by any other
    444 bend in the line. For example, the baseline of the bend $(A, B)$ may cross
    445 different bends in between, as depicted in
    446 figure~\onpage{fig:selfcrossing-1-non-neighbor}.
    447 
    448 \begin{figure}[h]
    449     \centering
    450     \begin{subfigure}[b]{.45\textwidth}
    451         \includegraphics[width=\textwidth]{selfcrossing-1-before}
    452         \caption{Bend's baseline (dotted) is crossing a non-neighboring bend}
    453     \end{subfigure}
    454     \hfill
    455     \begin{subfigure}[b]{.45\textwidth}
    456         \includegraphics[width=\textwidth]{selfcrossing-1-after}
    457         \caption{Self-crossing removed following the algorithm}
    458     \end{subfigure}
    459     \caption{Self-crossing with non-neighboring bend}
    460     \label{fig:selfcrossing-1-non-neighbor}
    461 \end{figure}
    462 
    463 Naively implemented, checking every bend with every bend is costs $O(n^2)$. In
    464 other words, the time it takes to run the algorithm grows quadratically with
    465 the with the number of vertices.
    466 
    467 It is possible to optimize this step and skip checking some of the bends. Only
    468 bends whose sum of inner angles is $\pi$ can ever self-cross. If the value is
    469 less than $\pi$, it cannot cross other bends. That way, only a fraction of
    470 bends need to be checked.
    471 
    472 \subsection{Attributes of a Single Bend}
    473 
    474 
    475 
    476 \textsc{Compactness Index} is "the ratio of the area of the polygon over the
    477 circle whose circumference length is the same as the length of the
    478 circumference of the polygon" \cite{wang1998line}. Given a bend, its
    479 compactness index is calculated as follows:
    480 
    481 \begin{enumerate}
    482 
    483   \item Construct a polygon by joining first and last vertices of the bend.
    484 
    485   \item Calculate area of the polygon $P$.
    486 
    487   \item Calculate perimeter of the polygon $u$. The same value is the
    488     circumference of the circle.
    489 
    490   \item Given circle's perimeter $u$, circle's area $A$ is:
    491 
    492     \[
    493       A = \frac{u^2}{4\pi}
    494     \]
    495 
    496   \item Compactness index is $\nicefrac{P}{A}$:
    497 
    498     \[
    499       cmp = \frac{P}{A} = \frac{P}{ \frac{u^2}{4\pi} } = \frac{4\pi P}{u^2}
    500     \]
    501 
    502 \end{enumerate}
    503 
    504 Other than that, once this section is implemented, each bend will have a list
    505 of properties, upon which actions later will be performed.
    506 
    507 \subsection{Shape of a Bend}
    508 
    509 This section introduces \textsc{adjusted size}, which trivially derives from
    510 \textsc{compactness index} $cmp$ and shape's area $A$:
    511 
    512 \[
    513     adjsize = \frac{0.75 A}{cmp}
    514 \]
    515 
    516 Adjusted size becomes necessary later to compare bends with each other, and
    517 find out similar ones.
    518 
    519 \subsection{Isolated Bend}
    520 
    521 Bend itself and its "isolation" can be described by \textsc{average curvature},
    522 which is \textcquote{wang1998line}{geometrically defined as the ratio of
    523 inflection over the length of a curve.}
    524 
    525 Two conditions must be true to claim that a bend is isolated:
    526 
    527 \begin{enumerate}
    528     \item \textsc{average curvature} of neighboring bends, should be larger
    529         than the "candidate" bend's curvature; this implementation arbitrarily
    530         chose $\isolationThreshold$.
    531 
    532     \item Bends on both sides of the "candidate" should be longer than a
    533         certain value. This implementation does not (yet) define such a
    534         constraint and will only follow the average curvature constraint above.
    535 \end{enumerate}
    536 
    537 \subsection{The Context of a Bend: Isolated and Similar Bends}
    538 
    539 To find out whether two bends are similar, they are compared by 3 components:
    540 
    541 \begin{enumerate}
    542     \item \textsc{adjusted size}
    543     \item \textsc{compactness index}
    544     \item Baseline length
    545 \end{enumerate}
    546 
    547 These 3 components represent a point in the 3-dimensional space, and Euclidean
    548 distance $d$ between those is calculated to differentiate between bends $p$ and
    549 $q$:
    550 
    551 \[
    552     d(p,q) = \sqrt{(adjsize_p-adjsize_q)^2 +
    553                    (cmp_p-cmp_q)^2 +
    554                    (baseline_p-baseline_q)^2}
    555 \]
    556 
    557 The more similar the bends are, the smaller the distance $d$.
    558 
    559 \subsection{Elimination Operator}
    560 
    561 \subsection{Combination Operator}
    562 
    563 \subsection{Exaggeration Operator}
    564 
    565 \section{Program Implementation}
    566 
    567 \section{Results of Experiments}
    568 
    569 \section{Conclusions}
    570 \label{sec:conclusions}
    571 
    572 \section{Related Work and future suggestions}
    573 \label{sec:related_work}
    574 
    575 \printbibliography
    576 
    577 \begin{appendices}
    578 
    579 \section{Code listings}
    580 
    581 \subsection{Reproducing the generalizations in this paper}
    582 
    583 We strongly believe in the ability to reproduce the results is critical for any
    584     scientific work. To make it possible for this paper, all source files and
    585     accompanying scripts have been attached to the PDF. To re-generate this
    586     document and its accompanying graphics, run this script (assuming name of
    587     this document is {\tt mj-msc-full.pdf}):
    588 
    589 \inputminted[fontsize=\small]{bash}{extract-and-generate}
    590 
    591 This was tested on Linux Debian 11 with upstream packages only.
    592 
    593 \subsection{Algorithm code listings}
    594 \inputminted[fontsize=\small]{postgresql}{wm.sql}
    595 
    596 \end{appendices}
    597 \end{document}