blob db669a72 (21342B) - Raw
1 \documentclass[a4paper]{article} 2 3 \usepackage[T1]{fontenc} 4 \usepackage[english]{babel} 5 \usepackage[utf8]{inputenc} 6 %\usepackage{a4wide} 7 \usepackage [autostyle, english=american]{csquotes} 8 \MakeOuterQuote{"} 9 \usepackage[maxbibnames=99,style=numeric,sorting=none]{biblatex} 10 \addbibresource{bib.bib} 11 \usepackage[pdfusetitle]{hyperref} 12 \usepackage{enumitem} 13 \usepackage[toc,page,title]{appendix} 14 \usepackage{caption} 15 \usepackage{subcaption} 16 \usepackage{gensymb} 17 \usepackage{units} 18 \usepackage{varwidth} 19 \usepackage{tabularx} 20 \usepackage{float} 21 \usepackage{tikz} 22 \usepackage{minted} 23 \usepackage{fancyvrb} 24 \input{version.inc} 25 \input{vars.inc} 26 27 % for layout debugging 28 \usepackage{layouts} 29 30 \newcommand{\onpage}[1]{\ref{#1} on page~\pageref{#1}} 31 \newcommand{\titlecite}[1]{\citetitle{#1} \cite{#1}} 32 \newcommand{\DP}{Douglas \& Peucker} 33 \newcommand{\VW}{Visvalingam--Whyatt} 34 \newcommand{\WM}{Wang--M{\"u}ller} 35 \newcommand{\MYTITLE}{Cartographic Generalization of Lines using free software (example of rivers)} 36 \newcommand{\MYAUTHOR}{Motiejus Jakštys} 37 38 \title{\MYTITLE} 39 \author{\MYAUTHOR} 40 \date{\VCDescribe} 41 42 \begin{document} 43 44 \begin{titlepage} 45 \begin{center} 46 \includegraphics[width=0.4\textwidth]{vu} 47 48 \huge 49 \textbf{\MYTITLE} \\[4ex] 50 51 \LARGE 52 \textbf{\MYAUTHOR} \\[8ex] 53 54 \vfill 55 56 A thesis presented for the degree of\\ 57 Master in Cartography \\[3ex] 58 59 \large 60 \VCDescribe 61 \end{center} 62 \end{titlepage} 63 64 \begin{abstract} 65 \label{sec:abstract} 66 Current open-source line generalization solutions have their roots in 67 mathematics and geometry, and are not fit for natural objects like rivers 68 and coastlines. This paper discusses our implementation of {\WM} algorithm 69 under and open-source license, explains things that we would had 70 appreciated in the original paper and compares our results to different 71 generalization algorithms. 72 \end{abstract} 73 74 \newpage 75 76 \tableofcontents 77 \listoffigures 78 79 \newpage 80 81 \section{Introduction} 82 \label{sec:introduction} 83 84 \iffalse 85 NOTICE: this value should be copied to layer2img.py:TEXTWIDTH, so dimensions 86 of inline images are reasonable. 87 88 textwidth in cm: {\printinunitsof{cm}\prntlen{\textwidth}} 89 \fi 90 91 When creating small-scale maps, often the detail of the data source is greater 92 than desired for the map. This becomes especially acute for natural features 93 that have many bends, like coastlines, rivers and forest boundaries. 94 95 To create a small-scale map from a large-scale data source, these features need 96 to be generalized: detail should be reduced. However, while doing so, it is 97 important to preserve the "defining" shape of the original feature, otherwise 98 the result will look unrealistic. 99 100 For example, if a river is nearly straight, it should be nearly straight after 101 generalization, otherwise a too straightened river will look like a canal. 102 Conversely, if the river is highly wiggly, the number of bends should be 103 reduced, but not removed. 104 105 Generalization problem for other objects can often be solved by other 106 non-geometric means: 107 108 \begin{itemize} 109 \item Towns and cities can be filtered and generalized by number of 110 inhabitants. 111 \item Roads can be eliminated by the road length, number of lanes, or 112 classification of the road (local, regional, international). 113 \end{itemize} 114 115 Natural line generalization problem can be viewed as having two competing 116 goals: 117 118 \begin{itemize} 119 \item Reduce detail by removing or simplifying "less important" features. 120 \item Retain enough detail, so the original is still recognize-able. 121 \end{itemize} 122 123 Given the discussed complexities, a fine line between under-generalization 124 (leaving object as-is) and over-generalization (making a straight line) must be 125 found. Therein lies the complexity of generalization algorithms: all have 126 different trade-offs. 127 128 \section{Literature review and problematic} 129 \label{sec:literature-review} 130 131 A number of cartographic line generalization algorithms have been researched. 132 The "classical" ones are {\DP} and {\VW}. 133 134 \subsection{Available algorithms} 135 136 \subsubsection{{\DP}, {\VW} and Chaikin's} 137 138 {\DP} \cite{douglas1973algorithms} and {\VW} \cite{visvalingam1993line} are 139 "classical" line generalization computer graphics algorithms. They are 140 relatively simple to implement, require few runtime resources. Both of them 141 accept only a single parameter, based on desired scale of the map, which makes 142 them very simple to adjust for different scales. 143 144 Both algorithms are part of PostGIS, a free-software GIS suite: 145 \begin{itemize} 146 \item {\DP} via 147 \href{https://postgis.net/docs/ST_Simplify.html}{PostGIS Simplify}. 148 149 \item {\VW} via 150 \href{https://postgis.net/docs/ST_SimplifyVW.html}{PostGIS SimplifyVW}. 151 \end{itemize} 152 153 Examples of <TBD Chaikin and others> 154 155 It may be worthwhile to post-process those through a widely available Chaikin's 156 line smoothing algorithm \cite{chaikin1974algorithm} via 157 \href{https://postgis.net/docs/ST_ChaikinSmoothing.html}{PostGIS 158 ChaikinSmoothing}. 159 160 161 \subsubsection{Modern approaches} 162 163 Due to their simplicity and ubiquity, {\DP} and {\VW} have been established as 164 go-to algorithms for line generalization. During recent years, alternatives 165 have emerged. These modern replacements fall into roughly two categories: 166 167 \begin{itemize} 168 169 \item Cartographic knowledge was encoded to an algorithm (bottom-up 170 approach). One among these are \titlecite{wang1998line}, also known 171 as {\WM}'s algorithm. 172 173 \item Mathematical shape transformation which yields a more cartographic 174 result. E.g. \titlecite{jiang2003line}, 175 \titlecite{dyken2009simultaneous}, \titlecite{mustafa2006dynamic}, 176 \titlecite{nollenburg2008morphing}. 177 178 \end{itemize} 179 180 Authors of most of the aforementioned articles have implemented the 181 generalization algorithm, at least to generate the visuals in the articles. 182 However, I wasn't able to find code for any of those to evaluate with my 183 desired data set, or use as a basis for my own maps. {\WM} \cite{wang1998line} 184 is available in a commercial product. 185 186 Lack of robust openly available generalization algorithm implementations poses 187 a problem for map creation with free software: there is not a similar 188 high-quality simplification algorithm to create down-scaled maps, so any 189 cartographic work, which uses line generalization as part of its processing, 190 will be of sub-par quality. We believe that availability of high-quality 191 open-source tools is an important foundation for future cartographic 192 experimentation and development, thus it it benefits the cartographic society 193 as a whole. 194 195 \subsection{Problematic with generalization of rivers} 196 197 \section{Methodology} 198 \label{sec:methodology} 199 200 The original {\WM}'s algorithm \cite{wang1998line} leaves something to be 201 desired for a practical implementation: it is not straightforward to implement 202 the algorithm from the paper alone. 203 204 Explanations in this document are meant to expand, rather than substitute, the 205 original description in {\WM}. Therefore familiarity with the original paper is 206 assumed, and, for some sections, having it close-by is necessary to 207 meaningfully follow this document. 208 209 In this paper we describe {\WM} in a detail that is more useful for algorithm: 210 each section will be expanded, with more elaborate and exact illustrations for 211 every step of the algorithm. 212 213 Algorithms discussed in this paper assume Euclidean geometry. 214 215 \subsection{Vocabulary and terminology} 216 217 This section defines vocabulary and terms as defined in the rest of the paper. 218 219 \begin{description} 220 221 \item[Vertex] is a point on a plane, can be expressed by a pair of $(x,y)$ 222 coordinates. 223 224 \item[Line Segment (or Segment)] joins two vertices by a straight line. A 225 segment can be expressed by two coordinate pairs: $(x_1, y_1)$ and 226 $(x_2, y_2)$. Line Segment and Segment are used interchangeably 227 throughout the paper. 228 229 \item[Line] represents a single linear feature in the real world. For 230 example, a river or a coastline. {\tt LINESTRING} in GIS terms. 231 232 Geometrically, A line is a series of connected line segments, or, 233 equivalently, a series of connected vertices. Each vertex connects to 234 two other vertices, except those vertices at either ends of the line: 235 these two connect to a single other vertex. 236 237 \item[Bend] is a subset of a line that humans perceive as a curve. The 238 geometric definition is complex and is discussed in 239 section~\onpage{sec:definition-of-a-bend}. 240 241 \item[Baseline] is a line between bend's first and last vertex. 242 243 \item[Sum of inner angles] TBD. 244 245 \end{description} 246 247 \subsection{Radians and Degrees} 248 249 This document contains a few constant angles expressed in radians. 250 Table~\ref{table:radians} summarizes some of the values used in this document 251 and the implementation. 252 253 \begin{table}[h] 254 \centering 255 \begin{tabular}{|c|c|c|c|c|c|c|} 256 \hline 257 Degrees & $30^\circ$ & $45^\circ$ & $90^\circ$ & $180^\circ$ & $360^\circ$ \\ 258 \hline 259 Radians & $\nicefrac{\pi}{6}$ & $\nicefrac{\pi}{4}$ & $\nicefrac{\pi}{2}$ & $\pi$ & $2\pi$ \\ 260 \hline 261 \end{tabular} 262 \caption{Popular degree and radian values} 263 \label{table:radians} 264 \end{table} 265 266 \subsection{Automated tests} 267 \label{sec:automated-tests} 268 269 As part of the algorithm realization, an automated test suite has been 270 developed. Shapes to test each function have been hand-crafted and expected 271 results have been manually calculated. The test suite executes parts of the 272 algorithm against a predefined set of geometries, and asserts that the output 273 matches the resulting hand-calculated geometry. 274 275 The full set of test geometries is visualized in 276 figure~\onpage{fig:test-figures}. The figure includes arrows depicting line 277 direction. 278 279 \begin{figure}[h] 280 \centering 281 \includegraphics[width=\textwidth]{test-figures} 282 \caption{Line geometries for automated test cases} 283 \label{fig:test-figures} 284 \end{figure} 285 286 The full test suite can be executed with a single command, and completes in a 287 few seconds. Having an easily accessible test suite boosts confidence that no 288 unexpected bugs have snug in while modifying the algorithm. 289 290 \section{Description of the implementation} 291 292 Like alluded in section~\onpage{sec:introduction}, {\WM} paper skims over 293 certain details, which are important to implement the algorithm. This section 294 goes through each algorithm stage, illustrating the intermediate steps and 295 explaining the author's desiderata for a more detailed description. 296 297 Illustrations of the following sections are extracted from the automated test 298 cases, which were written during the algorithm implementation (as discussed in 299 section~\onpage{sec:automated-tests}). 300 301 Lines in illustrations are black, and bends are heavily colored after 302 converting them to polygons. Bends are converted to polygons (for illustration 303 purposes) using the following algorithm: 304 305 \begin{itemize} 306 \item Join the first and last vertices of the bend, creating a polygon. 307 \item Color the polygons using distinct colors. 308 \end{itemize} 309 310 \subsection{Definition of a Bend} 311 \label{sec:definition-of-a-bend} 312 313 The original article describes a bend as: 314 315 \begin{displaycquote}{wang1998line} 316 A bend can be defined as that part of a line which contains a number of 317 subsequent vertices, with the inflection angles on all vertices included in 318 the bend being either positive or negative and the inflection of the bend's 319 two end vertices being in opposite signs. 320 \end{displaycquote} 321 322 While it gives a good intuitive understanding of what the bend is, this section 323 provides more technical details. Here are some non-obvious characteristics that 324 are necessary when writing code to detect the bends: 325 326 \begin{itemize} 327 \item End segments of each line should also belong to bends. That way, all 328 segments belong to 1 or 2 bends. 329 330 \item First and last segments of each bend (except for the two end-line 331 segments) is also the first vertex of the next bend. 332 \end{itemize} 333 334 Properties above may be apparent when looking at illustrations at this article 335 or reading here, but they are nowhere as such when looking at the original 336 article. 337 338 Figure~\ref{fig:fig8-definition-of-a-bend} illustrates article's Figure 8, 339 but with bends colored as polygons: each color is a distinctive bend. 340 341 \begin{figure}[h] 342 \centering 343 \includegraphics[width=\textwidth]{fig8-definition-of-a-bend} 344 \caption{Originally Figure 8: detected bends are highlighted} 345 \label{fig:fig8-definition-of-a-bend} 346 \end{figure} 347 348 \subsection{Gentle Inflection at End of a Bend} 349 350 The gist of the section is in the original article: 351 352 \begin{displaycquote}{wang1998line} 353 But if the inflection that marks the end of a bend is quite small, people 354 would not recognize this as the bend point of a bend 355 \end{displaycquote} 356 357 Figure~\ref{fig:fig5-gentle-inflection} visualizes original paper's Figure 5, 358 when a single vertex is moved outwards the end of the bend. 359 360 \begin{figure}[h] 361 \centering 362 \begin{subfigure}[b]{.45\textwidth} 363 \includegraphics[width=\textwidth]{fig5-gentle-inflection-before} 364 \caption{Before applying the inflection rule} 365 \end{subfigure} 366 \hfill 367 \begin{subfigure}[b]{.45\textwidth} 368 \includegraphics[width=\textwidth]{fig5-gentle-inflection-after} 369 \caption{After applying the inflection rule} 370 \end{subfigure} 371 \caption{Originally Figure 5: gentle inflections at the ends of the bend} 372 \label{fig:fig5-gentle-inflection} 373 \end{figure} 374 375 The illustration for this section was clear, but insufficient: it does not 376 specify how many vertices should be included when calculating the end-of-bend 377 inflection. We chose the iterative approach --- as long as the angle is "right" 378 and the distance is decreasing, the algorithm should keep re-assigning vertices 379 to different bends; practically not having an upper bound on the number of 380 iterations. 381 382 To prove that the algorithm implementation is correct for multiple vertices, 383 additional example was created, and illustrated in 384 figure~\ref{fig:inflection-1-gentle-inflection}: the rule re-assigns two 385 vertices to the next bend instead of one. 386 387 \begin{figure}[h] 388 \centering 389 \begin{subfigure}[b]{.45\textwidth} 390 \includegraphics[width=\textwidth]{inflection-1-gentle-inflection-before} 391 \caption{Before applying the inflection rule} 392 \end{subfigure} 393 \hfill 394 \begin{subfigure}[b]{.45\textwidth} 395 \includegraphics[width=\textwidth]{inflection-1-gentle-inflection-after} 396 \caption{After applying the inflection rule} 397 \end{subfigure} 398 \caption{Gentle inflection at the end of the bend when multiple vertices is moved} 399 \label{fig:inflection-1-gentle-inflection} 400 \end{figure} 401 402 To find and fix the gentle bends' inflections requires to run the algorithm in 403 both directions; if implemented as documented, the steps will fail to match 404 some bends that should be mutated. This implementation does it in the following 405 way: 406 407 \begin{enumerate} 408 \item Run the algorithm from beginning to the end. 409 \item \label{rev1} Reverse the line and each bend. 410 \item Run the algorithm again. 411 \item \label{rev2} Reverse the line and each bend. 412 \item Return result. 413 \end{enumerate} 414 415 The current implementation is the most straightforward, but not optimal: 416 reversing of lines and bends could be avoided by walking backwards the lines. 417 In this case, steps \ref{rev1} and \ref{rev2} could be spared, thus saving 418 memory and computation time. 419 420 The "quite small angle" was arbitrarily chosen to $\smallAngle$. 421 422 \subsection{Self-line Crossing When Cutting a Bend} 423 424 When bend's baseline crosses another bend, it is called self-crossing. This is 425 undesirable in the upcoming operators, and self-crossings should be removed 426 following the rules of the article. 427 428 \begin{figure}[h] 429 \centering 430 \begin{subfigure}[b]{.45\textwidth} 431 \includegraphics[width=\textwidth]{fig6-selfcrossing-before} 432 \caption{Bend's baseline (dotted) is crossing a neighboring bend} 433 \end{subfigure} 434 \hfill 435 \begin{subfigure}[b]{.45\textwidth} 436 \includegraphics[width=\textwidth]{fig6-selfcrossing-after} 437 \caption{Self-crossing removed following the algorithm} 438 \end{subfigure} 439 \caption{Originally Figure 6: simple case of self-line crossing} 440 \label{fig:fig6-selfcrossing} 441 \end{figure} 442 443 The self-line-crossing may happen not by the neighboring bend, but by any other 444 bend in the line. For example, the baseline of the bend $(A, B)$ may cross 445 different bends in between, as depicted in 446 figure~\onpage{fig:selfcrossing-1-non-neighbor}. 447 448 \begin{figure}[h] 449 \centering 450 \begin{subfigure}[b]{.45\textwidth} 451 \includegraphics[width=\textwidth]{selfcrossing-1-before} 452 \caption{Bend's baseline (dotted) is crossing a non-neighboring bend} 453 \end{subfigure} 454 \hfill 455 \begin{subfigure}[b]{.45\textwidth} 456 \includegraphics[width=\textwidth]{selfcrossing-1-after} 457 \caption{Self-crossing removed following the algorithm} 458 \end{subfigure} 459 \caption{Self-crossing with non-neighboring bend} 460 \label{fig:selfcrossing-1-non-neighbor} 461 \end{figure} 462 463 Naively implemented, checking every bend with every bend is costs $O(n^2)$. In 464 other words, the time it takes to run the algorithm grows quadratically with 465 the with the number of vertices. 466 467 It is possible to optimize this step and skip checking some of the bends. Only 468 bends whose sum of inner angles is $\pi$ can ever self-cross. If the value is 469 less than $\pi$, it cannot cross other bends. That way, only a fraction of 470 bends need to be checked. 471 472 \subsection{Attributes of a Single Bend} 473 474 475 476 \textsc{Compactness Index} is "the ratio of the area of the polygon over the 477 circle whose circumference length is the same as the length of the 478 circumference of the polygon" \cite{wang1998line}. Given a bend, its 479 compactness index is calculated as follows: 480 481 \begin{enumerate} 482 483 \item Construct a polygon by joining first and last vertices of the bend. 484 485 \item Calculate area of the polygon $P$. 486 487 \item Calculate perimeter of the polygon $u$. The same value is the 488 circumference of the circle. 489 490 \item Given circle's perimeter $u$, circle's area $A$ is: 491 492 \[ 493 A = \frac{u^2}{4\pi} 494 \] 495 496 \item Compactness index is $\nicefrac{P}{A}$: 497 498 \[ 499 cmp = \frac{P}{A} = \frac{P}{ \frac{u^2}{4\pi} } = \frac{4\pi P}{u^2} 500 \] 501 502 \end{enumerate} 503 504 Other than that, once this section is implemented, each bend will have a list 505 of properties, upon which actions later will be performed. 506 507 \subsection{Shape of a Bend} 508 509 This section introduces \textsc{adjusted size}, which trivially derives from 510 \textsc{compactness index} $cmp$ and shape's area $A$: 511 512 \[ 513 adjsize = \frac{0.75 A}{cmp} 514 \] 515 516 Adjusted size becomes necessary later to compare bends with each other, and 517 find out similar ones. 518 519 \subsection{Isolated Bend} 520 521 Bend itself and its "isolation" can be described by \textsc{average curvature}, 522 which is \textcquote{wang1998line}{geometrically defined as the ratio of 523 inflection over the length of a curve.} 524 525 Two conditions must be true to claim that a bend is isolated: 526 527 \begin{enumerate} 528 \item \textsc{average curvature} of neighboring bends, should be larger 529 than the "candidate" bend's curvature; this implementation arbitrarily 530 chose $\isolationThreshold$. 531 532 \item Bends on both sides of the "candidate" should be longer than a 533 certain value. This implementation does not (yet) define such a 534 constraint and will only follow the average curvature constraint above. 535 \end{enumerate} 536 537 \subsection{The Context of a Bend: Isolated and Similar Bends} 538 539 To find out whether two bends are similar, they are compared by 3 components: 540 541 \begin{enumerate} 542 \item \textsc{adjusted size} 543 \item \textsc{compactness index} 544 \item Baseline length 545 \end{enumerate} 546 547 These 3 components represent a point in the 3-dimensional space, and Euclidean 548 distance $d$ between those is calculated to differentiate between bends $p$ and 549 $q$: 550 551 \[ 552 d(p,q) = \sqrt{(adjsize_p-adjsize_q)^2 + 553 (cmp_p-cmp_q)^2 + 554 (baseline_p-baseline_q)^2} 555 \] 556 557 The more similar the bends are, the smaller the distance $d$. 558 559 \subsection{Elimination Operator} 560 561 \subsection{Combination Operator} 562 563 \subsection{Exaggeration Operator} 564 565 \section{Program Implementation} 566 567 \section{Results of Experiments} 568 569 \section{Conclusions} 570 \label{sec:conclusions} 571 572 \section{Related Work and future suggestions} 573 \label{sec:related_work} 574 575 \printbibliography 576 577 \begin{appendices} 578 579 \section{Code listings} 580 581 \subsection{Reproducing the generalizations in this paper} 582 583 We strongly believe in the ability to reproduce the results is critical for any 584 scientific work. To make it possible for this paper, all source files and 585 accompanying scripts have been attached to the PDF. To re-generate this 586 document and its accompanying graphics, run this script (assuming name of 587 this document is {\tt mj-msc-full.pdf}): 588 589 \inputminted[fontsize=\small]{bash}{extract-and-generate} 590 591 This was tested on Linux Debian 11 with upstream packages only. 592 593 \subsection{Algorithm code listings} 594 \inputminted[fontsize=\small]{postgresql}{wm.sql} 595 596 \end{appendices} 597 \end{document}