\documentclass[a4paper]{article}
\usepackage[T1]{fontenc}
\usepackage[american]{babel}
\usepackage[utf8]{inputenc}
\usepackage{fvextra}
\usepackage[autostyle,english=american]{csquotes}
\MakeOuterQuote{"}
\usepackage[
maxbibnames=99,
style=numeric,
sorting=none,
alldates=iso,
seconds=true
]{biblatex}
\addbibresource{bib.bib}
\usepackage[
pdfusetitle,
pdfkeywords={Line Generalization,Line Simplification,Wang--Mueller},
pdfborderstyle={/S/U/W 0} % /S/U/W 1 to enable reasonable decorations
]{hyperref}
\usepackage{enumitem}
\usepackage[toc,page,title]{appendix}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{dcolumn}
\usepackage{gensymb}
\usepackage{units}
\usepackage{varwidth}
\usepackage{tabularx}
\usepackage{float}
\usepackage{numprint}
\usepackage{tikz}
\usetikzlibrary{shapes.geometric,arrows,positioning}
\usepackage{fancyvrb}
\usepackage{layouts}
\usepackage{minted}
%\usepackage{charter}
%\usepackage{setspace}
%\doublespacing
\input{version.inc}
\input{vars.inc}
\newcommand{\onpage}[1]{\ref{#1} on page~\pageref{#1}}
\newcommand{\titlecite}[1]{\citetitle{#1}\cite{#1}}
\newcommand{\titleciteauthor}[1]{\citetitle{#1} by \citeauthor{#1}\cite{#1}}
\newcommand{\DP}{Douglas \& Peucker}
\newcommand{\VW}{Visvalingam--Whyatt}
\newcommand{\WM}{Wang--M{\"u}ller}
\newcommand{\WnM}{Wang and M{\"u}ller}
\newcommand{\WirM}{Wang ir M{\"u}ller}
% {\WM} algoritmo realizacija kartografinei upių generalizacijai
\newcommand{\MYTITLE}{{\WM} algorithm realization for cartographic line generalization}
\newcommand{\MYTITLENOCAPS}{wang--m{\"u}ller algorithm realization for cartographic line generalization}
\newcommand{\MYAUTHOR}{Motiejus Jakštys}
\newcommand{\inputcode}[2]{\inputminted[fontsize=\small]{#1}{#2}}
\newenvironment{longlisting}{\captionsetup{type=listing}}{}
\title{\MYTITLE}
\author{\MYAUTHOR}
\date{\VCDescribe}
\begin{document}
\begin{titlepage}
\begin{center}
\includegraphics[width=0.2\textwidth]{vu.pdf} \\[4ex]
\large
\textbf{\textsc{
vilnius university \\
faculty of chemistry and geosciences \\
department of cartography and geoinformatics
}} \\[8ex]
\textbf{\MYAUTHOR} \\[8ex]
\normalsize
A Thesis Presented for the Degree of Master in Cartography \\[8ex]
\LARGE
\textbf{\textsc{\MYTITLENOCAPS}}
\vfill
\normalsize
Supervisor Dr. Andrius Balčiūnas \\[16ex]
\VCDescribe
\end{center}
\end{titlepage}
\begin{abstract}
\label{sec:abstract}
Currently available line simplification algorithms are rooted in
mathematics and geometry, and are unfit for bendy map features like rivers
and coastlines. {\WnM} observed how cartographers simplify these natural
features and created an algorithm. We implemented this algorithm and
documented it in great detail. Our implementation makes the {\WM} algorithm
freely available in PostGIS, and this paper explains it.
\vfill
Šiuo metu prieinami linijų supaprastinimo algoritmai yra kilę iš
matematikos ir geometrijos, bei yra netinkami lankstiems geografiniams
objektams, tokiems kaip upės ir pakrantės. {\WirM} ištyrė, kaip kartografai
vykdo upių generalizaciją, ir sukūrė algoritmą. Mes realizavome šį
algoritmą ir išsamiai jį dokumentavome. Mūsų {\WM} realizacija ir
dokumentacija yra nemokami ir laisvai prieinami naudojant PostGIS
platformą.
\end{abstract}
\clearpage
\tableofcontents
\listoftables
\listoflistings
\newpage
\section{Introduction}
\label{sec:introduction}
\iffalse
NOTICE: this value should be copied to layer2img.py:TEXTWIDTH, so dimensions
of inline images are reasonable.
Textwidth in cm: {\printinunitsof{cm}\prntlen{\textwidth}}
\fi
When creating small-scale maps, the detail of the data source is often greater
than desired for the map. While many features can be removed or simplified,
this is trickier for natural features that have many bends, like coastlines,
rivers, or forest boundaries.

To create a small-scale map from a large-scale data source, features need to be
simplified, i.e., their detail should be reduced. While performing the
simplification, it is important to retain the "defining" shape of the original
feature: if the simplified feature looks too different from the original, the
result will look unrealistic. For some objects, the simplification problem can
often be solved by non-geometric means:
\begin{itemize}
\item Towns and cities can be filtered by the number of inhabitants.
\item Roads can be eliminated by the road length, number of lanes, or
classification of the road (local, regional, international).
\end{itemize}
However, things are not as simple for natural features like rivers or
coastlines. If a river is nearly straight, it should remain so after
simplification: an overly straightened river will look like a canal, while an
overly curvy one will not reflect its natural shape. Conversely, if the
original river is highly wiggly, the number of bends should be reduced, but the
bends should not be removed altogether. The natural line simplification problem
can be viewed as the task of finding a delicate balance between two competing
goals:
\begin{itemize}
\item Reduce detail by removing or simplifying "less important" features.
\item Retain enough detail, so the original is still recognizable.
\end{itemize}
Given these complexities with natural features, a fine line needs to be found
between under-simplification (leaving the object as-is) and over-simplification
(turning it into a straight line). Therein lies the complexity of
simplification algorithms: all have different trade-offs.

The purpose of this thesis is to implement a cartographic line generalization
algorithm on the basis of the {\WM} algorithm, using open-source software. The
tasks are:
\begin{itemize}
\item Evaluate existing line simplification algorithms.
\item Identify the main river generalization problems that arise when using
classical line simplification algorithms.
\item Define the method of the {\WM} technical implementation.
\item Implement the {\WM} algorithm, explaining the geometric
transformations in detail.
\item Apply the created algorithm to different datasets and compare
the results with national datasets.
\end{itemize}
The scientific relevance of this work is that the simplification processes
(steps) described by the {\WM} algorithm are analyzed in detail and implemented
in practice, and the implementation is described. This expands the knowledge of
cartographic theory about the generalization of natural objects' boundaries
according to their natural defining properties.
The original {\WM} article introducing the algorithm does not detail the steps
in a way that can be put into practice for specific data; this work specifies
them. Practically, this work makes it possible to use open-source software to
perform cartographic line generalization. The developed specialized
cartographic line simplification algorithm can be applied by cartographers to
implement automatic data generalization solutions. Given the open-source nature
of this work, the algorithm implementation can be modified freely.
\section{Literature Review and Problem Statement}
\label{sec:literature-review-problematic}
\subsection{Available Algorithms}
This section reviews the classical line simplification algorithms, which have
been around for a long time and offer easily accessible implementations, as
well as more modern ones, which are described in theory but do not come with
an available implementation.
\subsubsection{{\DP}, {\VW} and Chaikin's}
\label{sec:dp-vw-chaikin}
{\DP}\cite{douglas1973algorithms} and {\VW}\cite{visvalingam1993line} are
"classical" line simplification algorithms from computer graphics. They are
relatively simple to implement and require few runtime resources. Both accept
a single parameter based on the desired scale of the map, which makes them
straightforward to adjust for different scales.

Both algorithms are available in PostGIS, a free-software GIS suite:
\begin{itemize}
\item {\DP} via
\href{https://postgis.net/docs/ST_Simplify.html}{PostGIS \textsc{st\_simplify}}.
\item {\VW} via
\href{https://postgis.net/docs/ST_SimplifyVW.html}{PostGIS
\textsc{st\_simplifyvw}}.
\end{itemize}
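As a rough illustration of how the single tolerance parameter drives {\DP}, the
following Python sketch keeps a vertex only if it lies farther than the
tolerance from the chord joining the ends of the current stretch of line. It is
a minimal sketch and not the PostGIS implementation: the function names
\texttt{point\_segment\_distance} and \texttt{douglas\_peucker}, as well as the
sample coordinates, are assumptions made for this example, and distances are
measured to the chord segment rather than to the infinite line.

\begin{minted}[fontsize=\small]{python}
import math

def point_segment_distance(p, a, b):
    # Distance from point p to the segment a-b (all are (x, y) pairs).
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(line, tolerance):
    # The end vertices are always kept.
    if len(line) < 3:
        return list(line)
    # Find the interior vertex farthest from the chord first-last.
    index, dmax = 0, 0.0
    for i in range(1, len(line) - 1):
        d = point_segment_distance(line[i], line[0], line[-1])
        if d > dmax:
            index, dmax = i, d
    # Everything within the tolerance collapses to the chord.
    if dmax <= tolerance:
        return [line[0], line[-1]]
    # Otherwise keep the farthest vertex and recurse on both halves.
    left = douglas_peucker(line[:index + 1], tolerance)
    right = douglas_peucker(line[index:], tolerance)
    return left[:-1] + right

line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
simplified = douglas_peucker(line, 0.5)
\end{minted}

With these sample values, the slight bend at $(1, 0.1)$ is dropped because it
lies within the tolerance, while the pronounced bend at $(3, 2)$ survives.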
It may be worthwhile to post-process their output with Chaikin's line smoothing
algorithm\cite{chaikin1974algorithm}, available via
\href{https://postgis.net/docs/ST_ChaikinSmoothing.html}{PostGIS
\textsc{st\_chaikinsmoothing}}.
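The corner-cutting idea behind Chaikin's algorithm can be sketched in a few
lines of Python: every segment is replaced by the two points lying one quarter
and three quarters of the way along it, which rounds off each corner. This is a
sketch under stated assumptions, not the PostGIS code: the function name
\texttt{chaikin} and the sample coordinates are made up for this example, and
keeping the end vertices in place is a choice of this sketch.

\begin{minted}[fontsize=\small]{python}
def chaikin(line, iterations=1):
    # line is a list of (x, y) pairs; each iteration cuts every corner once.
    for _ in range(iterations):
        if len(line) < 3:
            break
        smoothed = [line[0]]  # keep the first vertex in place
        for (ax, ay), (bx, by) in zip(line, line[1:]):
            # Points at 1/4 and 3/4 of the segment a-b.
            smoothed.append((0.75 * ax + 0.25 * bx, 0.75 * ay + 0.25 * by))
            smoothed.append((0.25 * ax + 0.75 * bx, 0.25 * ay + 0.75 * by))
        smoothed.append(line[-1])  # keep the last vertex in place
        line = smoothed
    return list(line)

corner = [(0, 0), (4, 0), (4, 4)]
smoothed = chaikin(corner)
\end{minted}

In this example, the sharp corner at $(4, 0)$ is replaced by the two vertices
$(3, 0)$ and $(4, 1)$; further iterations repeat the cut on the new, blunter
corners.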

In the generalization examples, we use two rivers: Šalčia and Visinčia.
These rivers were chosen because they have both large and small bends, and
thus are convenient to analyze for both small- and large-scale generalization.
Figure~\onpage{fig:salvis-25} shows the two original rivers without any
simplification.
\begin{figure}[ht]
\centering
\includegraphics[width=\textwidth]{salvis-25k}
\caption{Example rivers for visual tests (1:{\numprint{25000}}).}
\label{fig:salvis-25}
\end{figure}
\begin{figure}[ht]
\centering
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{salvis-50k}
\caption{Example scaled 1:\numprint{50000}.}
\label{fig:salvis-50k}
\end{subfigure}
\hfill
\begin{subfigure}[b]{.49\textwidth}
\centering
\includegraphics[width=.2\textwidth]{salvis-250k-10x}
\caption{Example scaled 1:\numprint{250000}.}
\end{subfigure}
\caption{Down-scaled original river.}
\label{fig:salvis-50-250}
\end{figure}
The same rivers, unprocessed but at smaller scales (1:\numprint{50000} and
1:\numprint{250000}), are depicted in figure~\ref{fig:salvis-50-250}. Some
river features are so compact that a reasonably thin line depicting the river
touches itself, creating a thicker line. We can assume that some
simplification is worthwhile for scale 1:\numprint{50000}, and especially for
1:\numprint{250000}.
\begin{figure}[ht]
\centering
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{salvis-dp-64-50k}
\caption{Using {\DP}.}
\end{subfigure}
\hfill
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{salvis-vw-64-50k}
\caption{Using {\VW}.}
\end{subfigure}
\caption{Simplified using classical algorithms (1:\numprint{50000}).}
\label{fig:salvis-generalized-50k}
\end{figure}
Figure~\ref{fig:salvis-generalized-50k} illustrates the same river bend, but
simplified using the {\DP} and {\VW} algorithms. The resulting lines are jagged
and thus look unlike a real river. To smooth out the jaggedness, Chaikin's
algorithm\cite{chaikin1974algorithm} is traditionally applied after
simplification, as illustrated in
figure~\ref{fig:salvis-generalized-chaikin-50k}.
\begin{figure}[ht!]
\centering
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{salvis-dp-64-chaikin-50k}
\caption{{\DP} and Chaikin's.}
\label{fig:salvis-dp-64-chaikin-50k}
\end{subfigure}
\hfill
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{salvis-vw-64-chaikin-50k}
\caption{{\VW} and Chaikin's.}
\label{fig:salvis-vw-64-chaikin-50k}
\end{subfigure}
\caption{Simplified and smoothened river (1:\numprint{50000}).}
\label{fig:salvis-generalized-chaikin-50k}
\end{figure}
\begin{figure}[ht!]
\centering
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{salvis-overlaid-dp-64-chaikin-50k}
\caption{Original (fig.~\ref{fig:salvis-50k}) and simplified
(fig.~\ref{fig:salvis-dp-64-chaikin-50k}).}
\end{subfigure}
\hfill
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{salvis-overlaid-vw-64-chaikin-50k}
\caption{Original (fig.~\ref{fig:salvis-50k}) and simplified
(fig.~\ref{fig:salvis-vw-64-chaikin-50k}).}
\end{subfigure}
\caption{Zoomed-in simplified and smoothened river and original.}
\label{fig:salvis-overlaid-generalized-chaikin-50k}
\end{figure}
\begin{figure}[b!]
\centering
\includegraphics[width=.9\textwidth]{amalgamate1}
\caption{Narrow bends amalgamating into thick unintelligible blobs.}
\label{fig:pixel-amalgamation}
\end{figure}
The simplified and smoothened example
(figure~\onpage{fig:salvis-generalized-chaikin-50k}) is more aesthetically
pleasing; however, it obscures natural river features.
Given the absence of rocks, the only natural features that influence the river
direction are topographic:
\begin{itemize}
\item A relatively straight river (completely straight or with small-angled
bends over a relatively long distance) implies a greater slope, more
water, and/or faster flow.
\item A bendy river, on the contrary, implies slower flow, a gentler slope,
and/or less water.
\end{itemize}
Both {\VW} and {\DP} have a tendency to remove the small bends altogether,
removing a valuable characterization of the river.
Sometimes low-water rivers on gentle slopes have many bends next to each
other. At low resolutions (on low-DPI screens or paper, when the river is
sufficiently zoomed out, or both), the small bends will amalgamate into
an unintelligible blob. Figure~\ref{fig:pixel-amalgamation} illustrates a
real-world example where a bendy river, normally 1 or 2 pixels wide, creates a
wide area in which the shapes of the individual bends become unintelligible. In
this example, classical algorithms would remove these bends altogether. A
cartographer would retain a few of those distinctive bends, but would increase
the distance between the bends, remove some of the bends, or both.
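The tendency of a classical algorithm to erase small bends can be reproduced
with a short Python sketch of the {\VW} idea: the interior vertex forming the
smallest triangle with its two neighbors is dropped repeatedly until all
remaining triangles reach a minimum area. This is a simplified sketch and not
the PostGIS implementation: the names \texttt{triangle\_area} and
\texttt{visvalingam\_whyatt}, the zigzag test line, and the area thresholds are
assumptions made for this example, and the quadratic-time loop favors
readability over speed.

\begin{minted}[fontsize=\small]{python}
def triangle_area(a, b, c):
    # Area of the triangle a-b-c from the cross product of b-a and c-a.
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return abs(cross) / 2.0

def visvalingam_whyatt(line, min_area):
    pts = list(line)
    while len(pts) > 2:
        # Effective area of every interior vertex.
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(areas)
        if smallest >= min_area:
            break
        # Drop the least significant vertex and recompute.
        del pts[areas.index(smallest) + 1]
    return pts

# Three equally small bends next to each other.
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1), (6, 0)]
one_bend = visvalingam_whyatt(zigzag, 2)
no_bends = visvalingam_whyatt(zigzag, 4)
\end{minted}

With the smaller threshold, the three bends collapse into a single lopsided
one; with the larger threshold, the line becomes straight. Neither outcome
resembles what a cartographer would draw, namely fewer but still recognizable
bends.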
% TODO: figures shouldn't split the sentence.

For the reasons discussed in this section, the "classical" {\DP} and {\VW} are
not well-suited for natural river generalization, and it is worthwhile to look
for a more robust line generalization algorithm.
\subsubsection{Modern Approaches}
Due to their simplicity and ubiquity, {\DP} and {\VW} have been established as
go-to algorithms for line generalization. In recent years, alternatives
have emerged. These modern replacements fall into roughly two categories:
\begin{itemize}

\item Cartographic knowledge encoded into an algorithm (bottom-up
approach). One of these is \titlecite{wang1998line}, also known
as the {\WM} algorithm.

\item A mathematical shape transformation that yields a more cartographic
result, e.g., \titlecite{jiang2003line},
\titlecite{dyken2009simultaneous}, \titlecite{mustafa2006dynamic},
\titlecite{nollenburg2008morphing}, \titlecite{devangleserrorbends}.

\end{itemize}
The authors of most of the aforementioned articles have implemented their
generalization algorithms, at least to generate the illustrations in the
articles. However, the code is not available for evaluation with a desired
dataset, much less for use as a basis for creating new maps. To the author's
knowledge, {\WM}\cite{wang1998line} is available in a commercial product, but
using it requires purchasing the whole product suite; there is no way to
license the algorithm on its own.

The {\WM} algorithm was created by encoding professional cartographers'
knowledge into a computer algorithm. It has a few main properties which make it
especially suitable for generalization of natural linear features:
\begin{figure}[b]
\centering
\includegraphics[width=.8\textwidth]{wang125}
\caption{Figure 12.5 in \cite{wang1998line}: example of cartographic line
generalization.}
\label{fig:wang125}
\end{figure}
\begin{itemize}
\item Small bends are not always removed, but either combined (e.g.,
3 bends into 2), exaggerated, or removed, depending on the neighboring
bends.
\item Long and gentle bends are not straightened, but kept as-is.
\end{itemize}
As a result of these properties, the {\WM} algorithm retains the defining
properties of the natural features: fast-flowing rivers keep their appearance
instead of turning into canals, and slow-flowing bendy rivers retain their
frequent small bends.
The sub-figure labeled "proposed method" in figure~\ref{fig:wang125} (from the
original \titlecite{wang1998line}) illustrates the {\WM} algorithm.
\subsection{Problems with Generalization of Rivers}
This section introduces the reader to simplification and generalization, and
discusses two main problems with current-day automatic cartographic line
generalization:
\begin{itemize}
\item Currently available line simplification algorithms were created
to simplify geometries, but do not encode cartographic knowledge.
\item Existing cartographic line generalization algorithms are not freely
accessible.
\end{itemize}
\subsubsection{Simplification versus Generalization}
It is important to note the distinction between simplification, cartographic
line generalization, and cartographic generalization.

Simplification reduces an object's detail in isolation, not taking the object's
natural properties or surrounding objects into account. If a river is
simplified, it may keep the approximate shape of the original river, but lose
some of the shapes that define it. For example:
\begin{itemize}
\item Low-water rivers on gentle slopes have many small bends next to each
other. A non-cartographic line simplification may remove all of them,
thus losing an important characteristic feature of the river: after such
simplification, it will be hard to tell that the original river was a
low-water river on a gentle slope.
\item A low-angle river bend over a long distance differs significantly
from a completely straight canal. A non-cartographic line simplification
may replace that bend with a straight line, making the river look more
like a canal than a river.
\end{itemize}
In other words, simplification processes the line, ignoring its geographic
features. It works well when the features are human-made (e.g., roads,
administrative boundaries, buildings). There are a number of freely available
non-cartographic line simplification algorithms, which this paper reviews.

Contrary to line simplification, cartographic generalization does not focus
on a single feature class (e.g., rivers), but on the whole map. For example,
line simplification may change river bends in a way that bridges (and roads to
the bridges) become misplaced. While line simplification is limited to a single
feature class, cartographic generalization is not. Fully automatic cartographic
generalization is not yet a solved problem. % <TODO: Reference needed>.

Cartographic line generalization falls in between the two: it does more than
line simplification, and less than cartographic generalization. Cartographic
line generalization deals with a single feature class and takes into account
its geographic properties, but ignores other features. This paper examines
{\WM}'s \titlecite{wang1998line}, a cartographic line generalization algorithm.
\subsubsection{Availability of Generalization Algorithms}
The lack of robust, openly available generalization algorithm implementations
poses a problem for map creation with free software: there is no high-quality
simplification algorithm for creating down-scaled maps, so any cartographic
work that uses line generalization as part of its processing will be of
sub-par quality. We believe that the availability of high-quality open-source
tools is an important foundation for future cartographic experimentation and
development, and thus benefits the cartographic community as a whole.

The commercial availability of {\WM} signals something about the value of the
algorithm: at least the authors of the commercial software suite deemed it
worthwhile to include. However, not everyone has access to the commercial
software suite, the funds to buy it, or the operating system required to run
it. PostGIS, in contrast, is free itself and runs on free platforms.
Therefore, algorithm implementations that run on PostGIS or other free
platforms are useful to a wider cartographic community than proprietary ones.
\subsubsection{Unfitness of Line Simplification Algorithms}
Section~\ref{sec:dp-vw-chaikin} illustrates the current gaps in line
simplification algorithms when applied to real rivers. To sum up, we highlight
the following cartographic problems from our examples:
\begin{description}
\item[Long bends] should remain long bends, instead of becoming fully
straight lines.
\item[Many small bends] should not all be removed. To retain a river's
character, the algorithm should retain some small bends, and, when they are
too small to be visible, they should be combined or exaggerated.
\end{description}
We limit the problem to cartographic line generalization. That is, full
cartographic generalization, which takes topology and other feature classes
into account, is out of scope.

Figure~\onpage{fig:wang125} illustrates the {\WM} algorithm, as shown in the
original paper. Note how the long bends remain curvy, and how some small bends
get exaggerated.
\section{Methodology}
\label{sec:methodology}
The original description of the {\WM} algorithm\cite{wang1998line} leaves
something to be desired for a practical implementation: it is not
straightforward to implement the algorithm from the paper alone.

Explanations in this document are meant to expand on, rather than substitute
for, the original description in {\WM}. Therefore, familiarity with the original
paper is assumed, and, for some sections, having the original close by is
necessary to meaningfully follow this document.

This paper describes {\WM} at a level of detail that is useful for anyone who
wishes to follow the algorithm implementation closely: each section is expanded
with additional commentary, and illustrations are added for non-obvious steps.
Corner cases are discussed, too.

Euclidean geometry is assumed throughout this document, unless noted otherwise.
\subsection{Main Geometry Elements Used by Algorithm}
\label{sec:vocab}
This section defines and explains the geometry elements that are used
throughout this paper and the implementation.
\begin{description}
\item[\normalfont\textsc{vertex}] is a point on a plane; it can be expressed
by a pair of $(x,y)$ coordinates.

\item[\normalfont\textsc{line segment}] or \textsc{segment} joins two
vertices by a straight line. A segment can be expressed by two
coordinate pairs: $(x_1, y_1)$ and $(x_2, y_2)$. Line segment and
segment are used interchangeably.

\item[\normalfont\textsc{line}] or \textsc{linestring} represents a single
linear feature, for example, a river or a coastline.
Geometrically, a line is a series of connected line segments, or,
equivalently, a series of connected vertices. Each vertex connects to
two other vertices, with the exception of the vertices at either end of
the line: each of these connects to a single other vertex.

\item[\normalfont\textsc{multiline}] or \textsc{multilinestring} is a
collection of linear features. In this implementation, it is used rarely
(normally, a river is a single line), but can be valid when, for example,
a river has an island.

\item[\normalfont\textsc{bend}] is a subset of a line that humans perceive
as a curve. The geometric definition is complex and is discussed in
section~\ref{sec:definition-of-a-bend}.

\item[\normalfont\textsc{baseline}] is the straight line between the bend's
first and last vertices.

\item[\normalfont\textsc{sum of inner angles}] is a measure of how "curved"
the bend is. Assume that the first and last bend vertices are vectors. Then
the sum of inner angles is the angular difference of those two vectors.

\item[\normalfont\textsc{algorithmic complexity}] measured in \textsc{big o
notation}, is a relative measure that helps explain how
long\footnote{The upper bound, i.e., the worst case.} the
algorithm will run depending on its input. It is widely used in computing
science when discussing the efficiency of a given algorithm.
For example, given $n$ objects and a time complexity of $O(\log n)$, the
time it takes to execute the algorithm grows logarithmically with $n$.
Conversely, if the complexity is $O(n^2)$, then the time it takes to
execute the algorithm grows quadratically with the input size: if
the input size doubles, the time it takes to run the algorithm
roughly quadruples.
\textsc{big o notation} was introduced by
Bachmann\cite{bachmann1894analytische} in the late \textsc{xix} century,
adopted by Landau\cite{landau1911} in the early \textsc{xx} century, and
clarified and popularized for computing science by Donald
Knuth\cite{knuth1976big} in the 1970s.
\end{description}
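To make the quadrupling claim above concrete, assume (purely for illustration)
that the running time is exactly $T(n) = k n^2$ for some constant $k$; real
running times only approach this shape for large $n$. Doubling the input then
gives:
\[
    \frac{T(2n)}{T(n)} = \frac{k (2n)^2}{k n^2} = 4
\]
Under the same simplifying assumption, a logarithmic algorithm with
$T(n) = k \log_2 n$ only gains a constant when the input doubles:
$T(2n) - T(n) = k \log_2 2 = k$.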
\subsection{Algorithm Implementation Process}
\label{sec:algorithm-implementation-process}
\tikzset{
startstop/.style={trapezium,text centered,minimum height=2em,
trapezium left angle=70,trapezium right angle=110,draw=black,fill=red!20},
proc/.style={rectangle,minimum height=2em,text centered,draw=black,
fill=orange!20},
decision/.style={diamond,minimum height=2em,text centered,aspect=3,
draw=black,fill=green!20},
arrow/.style={thick,->,>=stealth},
}
\begin{figure}[!ht]
\centering
\begin{tikzpicture}[node distance=1.5cm,auto]
\node (start) [startstop] {Read \textsc{linestring}};
\node (detect) [proc,below of=start] {Detect bends};
\node (inflections) [proc,below of=detect] {Fix gentle inflections};
\node (selfcrossing) [proc,below of=inflections] {Eliminate self-crossing};
\node (mutated1) [decision,below of=selfcrossing] {Mutated?};
\node (bendattrs) [proc,below of=mutated1] {Compute bend attributes};
\node (exaggeration) [proc,below of=bendattrs] {Exaggeration};
\node (mutated2) [decision,below of=exaggeration] {Mutated?};
\node (elimination) [proc,below of=mutated2] {Elimination};
\node (mutated3) [decision,below of=elimination] {Mutated?};
\node (stop) [startstop,below of=mutated3] {Stop};
\coordinate [right of=mutated1,node distance=5cm] (mutated1y) {};
\coordinate [right of=mutated2,node distance=5cm] (mutated2y) {};
\coordinate [right of=mutated3,node distance=5cm] (mutated3y) {};
\draw [arrow] (start) -- (detect);
\draw [arrow] (detect) -- (inflections);
\draw [arrow] (inflections) -- (selfcrossing);
\draw [arrow] (selfcrossing) -- (mutated1);
\draw [arrow] (mutated1) -| node [near start] {Yes} (mutated1y) |- (detect);
\draw [arrow] (mutated1) -- node[anchor=west] {No} (bendattrs);
\draw [arrow] (bendattrs) -- (exaggeration);
\draw [arrow] (exaggeration) -- (mutated2);
\draw [arrow] (mutated2) -| node [near start] {Yes} (mutated2y) |- (detect);
\draw [arrow] (mutated2) -- node[anchor=west] {No} (elimination);
\draw [arrow] (mutated3) -| node [near start] {Yes} (mutated3y) |- (detect);
\draw [arrow] (mutated3) -- node[anchor=west] {No} (stop);
\draw [arrow] (elimination) -- (mutated3);
\end{tikzpicture}
\caption{Flow chart of the implementation workflow.}
\label{fig:flow-chart}
\end{figure}
Figure~\ref{fig:flow-chart} visualizes the algorithm steps for each line.
\textsc{multilinestring} features are split into \textsc{linestring} features
and processed in order.
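As a minimal sketch of the splitting step, the PostGIS function
\textsc{st\_dump} expands a \textsc{multilinestring} into its
\textsc{linestring} members together with their order. The input geometry
below is made up for illustration, and this is not necessarily how the
implementation itself performs the split.
\begin{minted}[fontsize=\small]{sql}
-- Split a multilinestring into its member linestrings, in order.
select d.path[1] as n, ST_AsText(d.geom) as line
from ST_Dump(
  'MULTILINESTRING((0 0,1 1),(2 2,3 3))'::geometry
) as d
order by n;
\end{minted}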
Judging from the {\WM} prototype flow chart (depicted in figure 11 of the
original paper), their approach is iterative over the line: it processes the
line in sequence, applying all steps to one bend before moving on to the next
bend. We will call this approach "streaming", because it does not require the
full line to be available before processing starts.

We have taken a different approach: each step is completed for the whole line
before moving on to the next step. This approach has the following advantages:
\begin{itemize}
\item When the \textsc{eliminate self-crossing} stage finds a bend with
the right sum of inflection angles, it checks the whole line for
self-crossings. This is impossible with streaming, because it requires
having the full line in memory. Streaming could be approximated by, for
example, checking only a fixed number of neighboring bends (say, 10), but
that would complicate the implementation.
\item \textsc{fix gentle inflections} iterates over the same line twice, from
opposite directions. That could be rewritten in a streaming fashion, but
it would complicate the implementation, too.
\end{itemize}
On the other hand, compared to the {\WM} prototype flow chart, our
implementation uses more memory (because it needs to have the full line before
processing), and some steps are unnecessarily repeated, such as re-computing
the bends' attributes during repeated iterations.
\subsection{Technical Implementation}
\label{sec:technical-implementation}
The algorithm was implemented in \titlecite{postgis311}. PostGIS
is a PostgreSQL extension for working with spatial data.
PostgreSQL is an open-source relational database, widely used in industry and
academia. PostgreSQL can be interfaced from nearly any programming language;
therefore, solutions written for PostgreSQL (and its extensions) are usable in
many environments. On top of that, PostGIS implements a rich set of
functions\cite{postgisref} for working with geometric and geographic objects.

Due to its wide applicability and rich library of spatial functions, PostGIS
was chosen as the implementation platform for the {\WM} algorithm. The
implementation exposes the entrypoint function \textsc{st\_simplifywm}:
\begin{minted}[fontsize=\small]{sql}
create function ST_SimplifyWM(
geom geometry,
dhalfcircle float,
intersect_patience integer default 10,
dbgname text default null
) returns geometry
\end{minted}
This function accepts the following parameters:
\begin{description}
\item[\normalfont\textsc{geom}] is the input geometry: either a
\textsc{linestring} or a \textsc{multilinestring}.
\item[\normalfont\textsc{dhalfcircle}] is the diameter of the half-circle,
explained in section~\ref{sec:bend-scaling-and-dimensions}.
\item[\normalfont\textsc{intersect\_patience}] is an optional parameter of
the exaggeration operator, explained in
section~\ref{sec:exaggeration-operator}.
\item[\normalfont\textsc{dbgname}] is an optional human-readable name of
the figure, explained in section~\ref{sec:debugging}.
\end{description}
The function \textsc{st\_simplifywm} calls into helper functions, which detect,
transform, or remove bends. These helper functions are also defined in the
implementation and are part of the algorithm's technical realization. All
supporting functions use spatial manipulation functions provided by PostGIS.
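For orientation, a call to the entrypoint could look as follows. This is a
sketch under assumptions: the input line is made up, its coordinates are
assumed to be in meters, and the value 75 is merely the diameter listed for
1:\numprint{50000} maps in section~\ref{sec:bend-scaling-and-dimensions}.
The remaining parameters keep their defaults.
\begin{minted}[fontsize=\small]{sql}
-- Simplify a hand-made zig-zag line; only dhalfcircle is specified.
select ST_AsText(
  ST_SimplifyWM(
    'LINESTRING(0 0,100 80,200 0,300 80,400 0)'::geometry,
    dhalfcircle => 75
  )
);
\end{minted}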
\subsection{Automated Tests}
\label{sec:automated-tests}
As part of the algorithm realization, an automated test suite has been
developed. Shapes to test each function have been hand-crafted, and expected
results have been manually calculated. The test suite executes parts of the
algorithm against a predefined set of geometries and asserts that the output
matches the hand-calculated geometries.

The full set of test geometries is visualized in figure~\ref{fig:test-figures}.
\begin{figure}[ht]
\centering
\includegraphics[width=\textwidth]{test-figures}
\caption{Geometries for automated test cases.}
\label{fig:test-figures}
\end{figure}
The test suite can be executed with a single command and completes in about a
second. Having an easily accessible test suite boosts confidence that no
unexpected bugs have snuck in while modifying the algorithm.

We will describe two instances when automated tests were very useful during
the implementation:
\begin{itemize}
\item We created a function \textsc{wm\_exaggeration}, which exaggerates bends
following the rules. It worked well over simple geometries but, due to
a subtle bug, created a self-crossing bend in Visinčia. The offending
bend was copied to the automated test suite, which helped fix the bug.
Now the test suite contains the same bend (a hook-like bend on the
right-hand side of figure~\ref{fig:test-figures}) and code to verify
that it is correctly exaggerated.
\item During algorithm development, automated tests are run about once a
minute. They quickly find logical and syntax errors. In contrast,
running the algorithm with real rivers takes a few minutes, which
lengthens the feedback loop and makes even the "simple" errors take
longer to fix.
\end{itemize}
Whenever we find and fix a bug, we aim to create an automated test case for it,
so that the same bug is not re-introduced by whoever works next on the same
piece of code.

Besides testing for specific cases, an automated test suite ensures future
stability and longevity of the implementation itself: when new contributors
start changing code, they have higher assurance that they have not broken
already-working code.
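The pattern described in this section (a hand-crafted shape, a hand-calculated
expectation, and an assertion) can be sketched in PL/pgSQL. In this
hypothetical example the function under test is replaced by a stand-in built
from PostGIS primitives, namely the baseline of a bend as defined in
section~\ref{sec:vocab}; it is not taken from the actual test suite.
\begin{minted}[fontsize=\small]{sql}
do $$
declare
  -- Hand-crafted bend and its hand-calculated baseline.
  bend geometry := 'LINESTRING(0 0,0 10,10 10,10 0)';
  want geometry := 'LINESTRING(0 0,10 0)';
  got  geometry;
begin
  -- Stand-in for the function under test.
  got := ST_MakeLine(ST_StartPoint(bend), ST_EndPoint(bend));
  assert ST_Equals(got, want),
    format('baseline mismatch: got %s', ST_AsText(got));
end $$;
\end{minted}
If the assertion fails, the block aborts with the given message, which makes
such checks easy to run from a single script.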
\subsection{Reproducibility}
\label{sec:reproducing-the-paper}
It is widely believed that the ability to reproduce the results of a published
study is important to the scientific community. In practice, however, it is
often hard or impossible: research methodologies, as well as algorithms
themselves, are explained in prose, which, due to the nature of natural
language, lends itself to inexact interpretations.

This article, besides explaining the algorithm in prose, includes the program
of the algorithm in a form that can be executed on the reader's workstation.
On top of that, all the illustrations in this paper are generated using that
algorithm from a predefined list of test geometries (see
section~\ref{sec:automated-tests}).

This article and accompanying code are accessible on GitHub as of 2021-05-21
\cite{wmsql}.

Instructions on how to re-generate all the visualizations are in
appendix~\ref{sec:code-regenerate}. The visualization code serves as a good
reference example for anyone willing to start using the algorithm.
\section{Algorithm Implementation}
As alluded to in section~\ref{sec:introduction}, the {\WM} paper skims over
certain details which are important to implement the algorithm. This section
goes through each algorithm stage, illustrating the intermediate steps and
explaining the author's desiderata for a more detailed description.

Illustrations in the following sections are extracted from the automated test
cases which were written during the algorithm implementation (as discussed in
section~\ref{sec:automated-tests}).
\subsection{Debugging}
\label{sec:debugging}
This implementation includes debugging facilities in the form of a table
\textsc{wm\_debug}. The table's schema is shown in
listing~\ref{lst:wm-debug-sql}.
\begin{listing}[h]
\begin{minted}[fontsize=\small]{sql}
drop table if exists wm_debug;
create table wm_debug(
id serial,
stage text not null,
name text not null,
gen bigint not null,
nbend bigint,
way geometry,
props jsonb
);
\end{minted}
\caption{\textsc{wm\_debug} table definition}
\label{lst:wm-debug-sql}
\end{listing}
When debug mode is active, implementation steps store their results, which
can be useful for manually inspecting the results of intermediate actions.
Besides manual inspection, most of the figures in this article are
visualized from the \textsc{wm\_debug} table. Debug mode is activated
by passing a non-empty \textsc{dbgname} string to the function
\textsc{st\_simplifywm} (described in
section~\ref{sec:technical-implementation}). By convention, \textsc{dbgname} is
the name of the geometry that is being simplified, e.g., \textsc{šalčia}. The
purpose of each column in \textsc{wm\_debug} is described below:
\begin{description}
\item[\normalfont\textsc{id}] is a unique identifier for each feature,
generated automatically by PostgreSQL. It is useful when it is necessary to
copy one or more features to a separate table for unit tests, as
described in section~\ref{sec:automated-tests}.
\item[\normalfont\textsc{stage}] is the stage of the algorithm. As of
writing, there are a few:
\begin{description}
\item[\normalfont\textsc{afigures}] at the beginning of the loop.
\item[\normalfont\textsc{bbends}] after bends are detected.
\item[\normalfont\textsc{cinflections}] after gentle inflections
are fixed.
\item[\normalfont\textsc{dcrossings}] after self-crossings are
eliminated.
\item[\normalfont\textsc{ebendattrs}] after bend attributes are
calculated.
\item[\normalfont\textsc{gexaggeration}] after bends have been
exaggerated.
\item[\normalfont\textsc{helimination}] after bends have been
eliminated.
\end{description}
Some of these have sub-stages, which are encoded by a dash and a
sub-stage name, e.g., \textsc{bbends-polygon} creates polygon
geometries after bends have been detected; this particular example
is used to generate colored polygons in
figure~\ref{fig:fig8-definition-of-a-bend}.
\item[\normalfont\textsc{name}] is the name of the geometry, which comes from
the parameter~\textsc{dbgname}.
\item[\normalfont\textsc{gen}] is the top-level iteration number. In other
words, the number of times the execution flow passes through the
\textsc{detect bends} phase, as depicted in
figure~\onpage{fig:flow-chart}.
\item[\normalfont\textsc{nbend}] is the bend's index in its \textsc{line}.
\item[\normalfont\textsc{way}] is the geometry column.
\item[\normalfont\textsc{props}] is a free-form JSON object to store
miscellaneous values. For example, the \textsc{ebendattrs} phase stores a
boolean property \textsc{isolated}, which signifies whether the bend is
isolated or not (explained in section~\ref{sec:isolated-bend}).
\end{description}
When debug mode is turned off (that is, \textsc{dbgname} is left unspecified),
\textsc{wm\_debug} is empty and the algorithm runs slightly faster.
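As a sketch of how the table might be inspected, the following query lists the
bends recorded after the bend attributes were calculated, along with the
\textsc{isolated} flag. It assumes a prior run with \textsc{dbgname} set to
\textsc{šalčia}, and that the JSON key is spelled exactly \textsc{isolated}.
\begin{minted}[fontsize=\small]{sql}
-- Bends and their "isolated" flag, per top-level iteration.
select gen, nbend, ST_AsText(way) as bend,
       (props->>'isolated')::boolean as isolated
from wm_debug
where name = 'šalčia' and stage = 'ebendattrs'
order by gen, nbend;
\end{minted}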
\subsection{Merging Pieces of the River into One}
Example river geometries were sourced from OpenStreetMap\cite{openstreetmap}
and NŽT\cite{nzt}. Rivers in both data sources are stored as shorter line
segments, and multiple segments (usually hundreds or thousands for significant
rivers) define one full river. While this is convenient for storing and
editing, these segments are not explicitly related to each other. This poses a
problem for simplification algorithms, which operate on full linear features
at a time: full river geometries, not their parts.

Since these segments do not have an explicit relationship to connect them
together, they were connected using a heuristic: if two line segments share a
name and are within 500 meters of each other, then they form a single river.
All line simplification algorithms need the rivers to be combined, and
this approach proved to be reasonably effective. Source code for this
operation can be found in listing~\onpage{lst:aggregate-rivers.sql}.
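The heuristic can be approximated with distance-based clustering, sketched
below. This is an illustration rather than the code from the referenced
listing: the \textsc{river\_parts} relation and its contents are made up, and
\textsc{st\_clusterdbscan} is only one of several PostGIS functions that could
express "within 500 meters". With \textsc{minpoints} set to 1, every
segment ends up in some cluster.
\begin{minted}[fontsize=\small]{sql}
with river_parts(name, way) as (
  values
    ('A', 'LINESTRING(0 0,100 0)'::geometry),
    ('A', 'LINESTRING(100 0,200 50)'::geometry),
    ('A', 'LINESTRING(5000 0,5100 0)'::geometry)
), clustered as (
  -- Same name, and no gap larger than 500 units: same cluster.
  select name, way,
         ST_ClusterDBSCAN(way, eps := 500, minpoints := 1)
           over (partition by name) as cid
  from river_parts
)
select name, cid,
       ST_AsText(ST_LineMerge(ST_Collect(way))) as way
from clustered
group by name, cid
order by name, cid;
\end{minted}
Note that \textsc{st\_linemerge} only joins parts that actually touch; parts
separated by a gap remain members of a \textsc{multilinestring}.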
\subsection{Bend Scaling And Dimensions}
\label{sec:bend-scaling-and-dimensions}
{\WM} accepts a single input parameter: the diameter of a half-circle. If the
bend's adjusted size (explained in detail in section~\ref{sec:shape-of-a-bend})
is greater than the area of the half-circle, then the bend will be left
untouched. If the bend's adjusted size is smaller than the area of the provided
half-circle, the bend will be simplified: either exaggerated, combined, or
eliminated.

The extent of line simplification, as well as the half-circle's diameter,
depends on the desired target scale. Simplification should be more aggressive
for smaller target scales and less aggressive for larger scales. This section
goes through the process of finding the appropriate parameter value for the
{\WM} algorithm.

What is the smallest, but still legible, figure that should be displayed on
the map? According to \titlecite{cartoucheMinimalDimensions}, the map is
typically held at a distance of 30 cm. The recommended minimum symbol size,
given a viewing distance of 45 cm (1.5 feet), is 1.5 mm, as analyzed in
\titlecite{mappingunits}.

In our case, the target is a line bend rather than a symbol. Assume 1.5 mm is
the diameter of the bend. A semi-circle of 1.5 mm diameter is depicted in
figure~\ref{fig:half-circle}. A bend of this size or larger, when adjusted to
scale, will not be simplified.
\begin{figure}[ht]
\centering
\begin{tikzpicture}[x=1mm,y=1mm]
\draw[] (-10, 0) -- (-.75,0) arc (225:-45:.75) -- (10, 0);
\end{tikzpicture}
\caption{Smallest feature that will not be simplified (to scale).}
\label{fig:half-circle}
\end{figure}
The {\WM} algorithm does not have a notion of scale, but it does have a notion
of distance: it accepts a single parameter $D$, the half-circle's diameter.
Assuming measurement units in the projected coordinate system are meters (for
example, \titlecite{epsg3857}), the values of $D$ for some popular scales are
listed in table~\ref{table:scale-halfcirlce-diameter}.
\begin{table}[ht]
\centering
\begin{tabular}{ c D{.}{.}{1} }
Scale & \multicolumn{1}{c}{$D$ (m)} \\ \hline
1:\numprint{10000} & 15 \\
1:\numprint{15000} & 22.5 \\
1:\numprint{25000} & 37.5 \\
1:\numprint{50000} & 75 \\
1:\numprint{250000} & 375 \\
\end{tabular}
\caption{{\WM} half-circle diameter $D$ for popular scales.}
\label{table:scale-halfcirlce-diameter}
\end{table}
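The diameter for a given scale can be derived by multiplying the 1.5 mm
minimum bend diameter by the scale denominator $s$ and converting millimeters
to meters:
\[
    D = \frac{1.5 \times s}{1000}
\]
For example, for a 1:\numprint{50000} map,
$D = 1.5 \times \numprint{50000} / 1000 = 75$ meters.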
\subsection{Definition of a Bend}
\label{sec:definition-of-a-bend}
The original article describes a bend as:
\begin{displaycquote}{wang1998line}
A bend can be defined as that part of a line which contains a number of
subsequent vertices, with the inflection angles on all vertices included in
the bend being either positive or negative and the inflection of the bend's
two end vertices being in opposite signs.
\end{displaycquote}
While it gives a good intuitive understanding of what a bend is, this section
provides more technical details. Here are some non-obvious characteristics that
are necessary when writing code to detect the bends:
\begin{itemize}
\item End segments of each line should also belong to bends. That way, all
segments belong to 1 or 2 bends.
\item The first and last segments of each bend (except for the two end
segments of the line) also belong to the neighboring bends: the last
segment of a bend is the first segment of the next bend.
\end{itemize}
Figure~\ref{fig:fig8-definition-of-a-bend} reproduces the article's figure 8,
but with bends colored as polygons: each color is a distinct bend.
\begin{figure}[ht]
\centering
\includegraphics[width=\textwidth]{fig8-definition-of-a-bend}
\caption{Similar to figure 8 in \cite{wang1998line}: detected bends are
highlighted.}
\label{fig:fig8-definition-of-a-bend}
\end{figure}
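One way to see where bends begin and end is to compute the sign of the turn at
every inner vertex: a run of vertices with the same sign belongs to one bend,
and a change of sign marks an inflection. The query below is a sketch of that
idea using the cross product of consecutive segments; the zig-zag input is
made up, and the actual implementation may detect inflections differently.
\begin{minted}[fontsize=\small]{sql}
with pts as (
  select d.path[1] as i, ST_X(d.geom) as x, ST_Y(d.geom) as y
  from ST_DumpPoints(
    'LINESTRING(0 0,1 1,2 0,3 1,4 0)'::geometry
  ) as d
), turns as (
  -- z > 0: left (counter-clockwise) turn; z < 0: right turn.
  select i,
         (x - lag(x) over w) * (lead(y) over w - y)
       - (y - lag(y) over w) * (lead(x) over w - x) as z
  from pts
  window w as (order by i)
)
-- The first and last vertices have no turn and are skipped.
select i, sign(z) as turn_sign
from turns
where z is not null
order by i;
\end{minted}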
\subsection{Gentle Inflection at the End of a Bend}
The gist of the section is in the original article:
\begin{displaycquote}{wang1998line}
But if the inflection that marks the end of a bend is quite small, people
would not recognize this as the bend point of a bend
\end{displaycquote}

Figure~\ref{fig:fig5-gentle-inflection} reproduces the original paper's
figure 5, in which the end of the bend is moved by a single vertex.
\begin{figure}[ht]
\centering
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{fig5-gentle-inflection-before}
\caption{Before applying the inflection rule.}
\end{subfigure}
\hfill
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{fig5-gentle-inflection-after}
\caption{After applying the inflection rule.}
\end{subfigure}
\caption{Figure 5 in \cite{wang1998line}: gentle inflections at the ends of
the bend.}
\label{fig:fig5-gentle-inflection}
\end{figure}
% TODO: figure should not split the text.
The illustration for this section was clear but insufficient: it does not
specify how many vertices should be included when calculating the end-of-bend
inflection. An iterative approach was chosen: as long as the angle is
"right" and the baseline is becoming shorter, the algorithm keeps
re-assigning vertices to different bends. There is no upper bound
on the number of iterations.

To demonstrate that the implementation is correct for multiple vertices,
an additional example was created and is illustrated in
figure~\ref{fig:inflection-1-gentle-inflection}: the rule re-assigns two
vertices to the next bend.
\begin{figure}[ht]
\centering
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{inflection-1-gentle-inflection-before}
\caption{Before applying the inflection rule.}
\end{subfigure}
\hfill
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{inflection-1-gentle-inflection-after}
\caption{After applying the inflection rule.}
\end{subfigure}
\caption{Gentle inflection at the end of the bend with multiple vertices.}
\label{fig:inflection-1-gentle-inflection}
\end{figure}
Note that to find and fix the bends' gentle inflections, the algorithm should
run twice, in both directions. Otherwise, if it is executed in only one
direction, it will fail to match some bends that should be adjusted. The
current implementation works as follows:
\begin{enumerate}
\item Run the algorithm from the beginning to the end.
\item \label{rev1} Reverse the line and each bend.
\item Run the algorithm again.
\item \label{rev2} Reverse the line and each bend.
\item Return the result.
\end{enumerate}
Reversing the line and its bends is straightforward to implement but costly:
the two reversal steps cost additional time and memory. The algorithm could be
optimized by writing a variant of it that traverses the line backwards. In
that case, steps \ref{rev1} and \ref{rev2} could be skipped, saving memory and
computation time.

The "quite small angle" was arbitrarily set to \smallAngle.
\subsection{Self-Line Crossing When Cutting a Bend}
When a bend's baseline crosses another bend, it is called self-crossing.
Self-crossing is undesirable for the upcoming bend manipulation operators; therefore,
2021-05-19 22:57:48 +03:00
should be removed. There are a few rules on when and how they should be removed
--- this section explains them in higher detail, discusses their time
complexity and applied optimizations. Figure~\ref{fig:fig6-selfcrossing} is
copied from the original article.
2021-05-19 22:57:47 +03:00
2021-05-19 22:57:50 +03:00
\begin{figure}[ht]
\centering
\includegraphics[width=.5\textwidth]{fig6-selfcrossing}
\caption{Originally figure 6: the bend's baseline (orange) is crossing a neighboring bend.}
\label{fig:fig6-selfcrossing}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics[width=.5\textwidth]{selfcrossing-1}
\caption{The bend's baseline (orange) is crossing a non-neighboring bend.}
\label{fig:selfcrossing-1-non-neighbor}
\end{figure}
% TODO: figure should not split the text.
Looking at the {\WM} paper alone, it may seem that self-crossing can happen
only with a neighboring bend. This would allow an efficient $O(n)$
implementation\footnote{where $n$ is the number of bends in a line. See
explanation of \textsc{algorithmic complexity} in section~\ref{sec:vocab}.}.
However, as one can see in figure~\ref{fig:selfcrossing-1-non-neighbor}, this
is not the case: any other bend in the line may be crossing it.

If one translates the requirements to code in a straightforward way, it is
quite computationally expensive: naively implemented, the complexity of
checking every bend against every other bend is $O(n^2)$. In other words, the
time it takes to run the algorithm grows quadratically with the number of
bends.

It is possible to optimize this step and skip checking a large number of bends.
Only bends whose sum of inner angles is larger than $180^\circ$ can ever
self-cross. That way, only a fraction of bends need to be checked. The
worst-case complexity is still $O(n^2)$, when the sum of inner angles of every
bend is larger than $180^\circ$. With this optimization, the number of checks
(and, as a result, the time it takes to execute the algorithm) drops in
proportion to the fraction of bends whose sum of inner angles is smaller than
$180^\circ$.
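The naive pairwise check can be sketched in a single query. Everything here is
an assumption made for illustration: the three bends are made up, they are
stored as separate rows, and \textsc{st\_crosses} is used as the crossing
predicate. The optimization described above would add a filter on the sum of
inner angles of the bend that owns the baseline.
\begin{minted}[fontsize=\small]{sql}
with bends(nbend, way) as (
  values
    (1, 'LINESTRING(0 0,0 10,10 10,10 0)'::geometry),
    (2, 'LINESTRING(20 5,5 -5,5 5)'::geometry),
    (3, 'LINESTRING(30 0,30 10,40 10)'::geometry)
), baselines as (
  select nbend,
         ST_MakeLine(ST_StartPoint(way), ST_EndPoint(way)) as baseline
  from bends
)
-- Naive O(n^2) check: every baseline against every other bend.
select a.nbend as bend, b.nbend as crossed_bend
from baselines a
join bends b
  on b.nbend <> a.nbend
 and ST_Crosses(a.baseline, b.way)
order by bend, crossed_bend;
\end{minted}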
\subsection{Attributes of a Single Bend}
\textsc{compactness index} is "the ratio of the area of the polygon over the
circle whose circumference length is the same as the length of the
circumference of the polygon" \cite{wang1998line}. Given a bend, its
compactness index is calculated as follows:
\begin{enumerate}
\item Construct a polygon by joining the first and last vertices of the bend.

\item Calculate the area of the polygon $A_{p}$.

\item Calculate the perimeter $P$ of the polygon. The same value is the
circumference of the circle: $C = P$.

\item Given the circle's circumference $C$, the circle's area $A_{c}$ is:
\[
A_c = \frac{C^2}{4\pi}
\]
\item Compactness index $c$ is the area of the polygon $A_p$ divided by the
area of the circle $A_c$:
\[
c = \frac{A_p}{A_c} =
\frac{A_p}{ \frac{C^2}{4\pi} } =
\frac{4\pi A_p}{C^2}
\]
\end{enumerate}
Once this operation is complete, each bend has a list of properties
which will be used by the other modifying operators.
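The steps above can be sketched with PostGIS primitives. The bend below is
made up (three sides of a square), and the query is an illustration rather
than the helper function used by the implementation.
\begin{minted}[fontsize=\small]{sql}
with bend(way) as (
  values ('LINESTRING(0 0,0 10,10 10,10 0)'::geometry)
), poly as (
  -- Close the bend with its baseline to obtain a polygon.
  select ST_MakePolygon(ST_AddPoint(way, ST_StartPoint(way))) as p
  from bend
)
select ST_Area(p) as area_p,
       ST_Perimeter(p) as perimeter,
       4 * pi() * ST_Area(p) / (ST_Perimeter(p) ^ 2) as compactness
from poly;
\end{minted}
For this square-shaped bend, $A_p = 100$ and $P = 40$, so
$c = 4\pi \cdot 100 / 40^2 = \pi / 4 \approx 0.785$.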
\subsection{Shape of a Bend}
\label{sec:shape-of-a-bend}
This section introduces \textsc{adjusted size} $A_{adj}$, which trivially
derives from \textsc{compactness index} $c$ and the "polygonized" bend's area
$A_{p}$:
\[
A_{adj} = \frac{0.75 A_{p}}{c}
\]
Adjusted size is necessary later to compare bends with each other, or to
decide if the bend is within the simplification threshold.

Sometimes, when working with {\WM}, it is useful to convert between the
half-circle's diameter $D$ and adjusted size $A_{adj}$. These easily derive
from the half-circle's area formula
$A = \frac{\pi}{2} \left(\frac{D}{2}\right)^2$:
\[
D = 2\sqrt{\frac{2 A_{adj}}{\pi}}
\]
In reverse, adjusted size $A_{adj}$ from the half-circle's diameter:
\[
A_{adj} = \frac{\pi D^2}{8}
\]
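Both conversions follow from equating the adjusted size with the area of a
half-circle of diameter $D$, that is, half the area of a full circle of radius
$D/2$:
\[
    A_{adj} = \frac{1}{2} \pi \left(\frac{D}{2}\right)^2
            = \frac{\pi D^2}{8}
    \quad\Longrightarrow\quad
    D^2 = \frac{8 A_{adj}}{\pi}
    \quad\Longrightarrow\quad
    D = 2\sqrt{\frac{2 A_{adj}}{\pi}}
\]
As a worked example, $D = 75$ meters (the value listed for
1:\numprint{50000} in table~\ref{table:scale-halfcirlce-diameter}) corresponds
to $A_{adj} = \pi \cdot 75^2 / 8 \approx \numprint{2209}$ square meters.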
\subsection{Isolated Bend}
\label{sec:isolated-bend}

A bend and its "isolation" can be described by \textsc{average curvature},
which is \textcquote{wang1998line}{geometrically defined as the ratio of
inflection over the length of a curve.}

Two conditions must be met for a bend to be considered isolated:
\begin{enumerate}
\item \textsc{average curvature} of neighboring bends should be larger
than the "candidate" bend's curvature. The article did not offer a
threshold value; this implementation arbitrarily chose $\isolationThreshold$.

\item Bends on both sides of the "candidate" bend should be longer than a
certain value. This implementation does not (yet) define such a
constraint and follows only the average curvature constraint above.

\end{enumerate}
We believe the unclear criteria for \textsc{isolated bend} are one of the main
causes of jagged lines in section~\ref{sec:results}, and suggest them as an
area of further research in section~\ref{sec:future-suggestions}.

\subsection{The Context of a Bend: Isolated And Similar Bends}

To find out whether two bends are similar, they are compared by three components:
\begin{enumerate}
\item \textsc{adjusted size} $A_{adj}$.
\item \textsc{compactness index} $c$.
\item \textsc{baseline length} $l$.
\end{enumerate}
Components 1, 2 and 3 represent a point in a 3-dimensional space, and the
Euclidean distance $d(p,q)$ between two such points measures how much bends
$p$ and $q$ differ:
\[
d(p,q) = \sqrt{(A_{adj(p)}-A_{adj(q)})^2 +
(c_p-c_q)^2 +
(l_p-l_q)^2}
\]
The smaller the distance $d$, the more similar the bends are.
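
For example, with purely hypothetical values (not taken from the analyzed
data), let bend $p$ have $A_{adj} = 10$, $c = 0.5$, $l = 4$ and bend $q$ have
$A_{adj} = 13$, $c = 0.5$, $l = 8$:
\[
d(p,q) = \sqrt{(10-13)^2 + (0.5-0.5)^2 + (4-8)^2} = \sqrt{9 + 0 + 16} = 5
\]
Note that the three components have different units and magnitudes, so, unless
they are normalized, the component with the largest numeric range dominates
the distance.
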
\subsection{Elimination Operator}

Figure~\ref{fig:elimination-through-iterations} reproduces the steps of figure 8
from the original paper. There is not much to add to the original description
beyond repeating the elimination steps in an illustrated example.

\begin{figure}[ht]
\centering
\begin{subfigure}[b]{.7\textwidth}
\includegraphics[width=\textwidth]{fig8-elimination-gen1}
\caption{Original}
\end{subfigure}
\begin{subfigure}[b]{.7\textwidth}
\includegraphics[width=\textwidth]{fig8-elimination-gen2}
\caption{Iteration 1}
\end{subfigure}
\begin{subfigure}[b]{.7\textwidth}
\includegraphics[width=\textwidth]{fig8-elimination-gen3}
\caption{Iteration 2 (result)}
\end{subfigure}
\caption{Originally figure 8: the bend elimination through iterations.}
\label{fig:elimination-through-iterations}
\end{figure}
\subsection{Combination Operator}
\label{sec:combination-operator}

The combination operator is not implemented in this version.

\subsection{Exaggeration Operator}
\label{sec:exaggeration-operator}

The exaggeration operator finds bends whose \textsc{adjusted size} is smaller
than the area of the half-circle of the configured diameter
(\textsc{dhalfcircle}). Once a target bend is found, it
is exaggerated in increments until either of the following becomes true:
\begin{itemize}
\item \textsc{adjusted size} of the exaggerated bend is larger than the area of
the half-circle.
\item The exaggerated bend starts intersecting with a neighboring bend.
Exaggeration then aborts, and the bend remains as it was one step
before the intersection.
\end{itemize}
The exaggeration operator uses a hardcoded parameter \textsc{exaggeration step} $s
\in (1,2]$. It was arbitrarily set to {\exaggerationEnthusiasm} for this
implementation. A single exaggeration increment is done as follows:
\begin{enumerate}
\item Find a candidate bend.
\item Find the bend's baseline.
\item Find \textsc{midpoint}, the center of the bend's baseline.
\item Find \textsc{midbend}, the center of the bend. The distance from one
baseline vertex to \textsc{midbend} should be the same as from
\textsc{midbend} to the other baseline vertex.
\item Mark each vertex of the bend with a number in $[1,s]$. For the first
half of the bend, from the start vertex to \textsc{midbend}, the number
increases from $1$ to $s$, with values roughly proportional to the azimuth
between these lines:
\begin{itemize}
\item \textsc{midbend} and the point.
\item \textsc{midpoint} and the point.
\end{itemize}
The other half of the bend, from \textsc{midbend} to the final vertex,
is interpolated from $s$ down to $1$, using the same rules as for
the first half.

The first version of the algorithm used simple linear interpolation
based on the point's position in the line. The current version applies
a few coefficients, which were derived empirically by observing the
resulting bend.

\item Each point (except the beginning and end vertices of the bend) is
placed farther away from the baseline. The displacement is determined by
the value marked in the previous step.
\end{enumerate}
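
The simple linear interpolation of the first version can be sketched as
follows. This is an illustration under our own assumptions, not the exact
formula of the implementation: the bend's vertices are indexed
$i = 0, \dots, n$, and \textsc{midbend} coincides with the vertex of index
$k$, where $0 < k < n$. The mark $m_i$ of vertex $i$ is then:
\[
m_i = \left\{
\begin{array}{ll}
1 + (s - 1)\,\frac{i}{k}, & 0 \le i \le k \\
1 + (s - 1)\,\frac{n - i}{n - k}, & k < i \le n
\end{array}
\right.
\]
For a hypothetical $s = 1.2$, $n = 4$ and $k = 2$, the marks are
$1, 1.1, 1.2, 1.1, 1$: the end vertices receive the neutral value $1$, and
\textsc{midbend} receives the largest value $s$.
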
\begin{figure}[ht]
\centering
\includegraphics[width=.5\textwidth]{isolated-1-exaggerated}
\caption{Example of an exaggerated isolated bend.}
\label{fig:isolated-1-exaggerated}
\end{figure}

The implementation contains two variants of the exaggeration operator:
\begin{description}
\item[\normalfont\textsc{wm\_exaggerate\_bend}] is the original one. It
uses simple linear interpolation. It is fast, but simple. It tends to
leave jagged bends.
\item[\normalfont\textsc{wm\_exaggerate\_bend2}] is a more computationally
expensive function, which leaves better-looking exaggerated bends.
\end{description}
Both functions are interchangeable and can be found in listing~\ref{lst:wm.sql}.
Figure~\ref{fig:isolated-1-exaggerated} illustrates an exaggerated bend using
\textsc{wm\_exaggerate\_bend2}.
\section{Results}
\label{sec:results}

This section visualizes the results, discusses robustness and issues of the
generalization, and suggests specific improvements.

One of our goals is to compare the generalized lines with the official
generalized dataset\cite{nzt}. Therefore, we have selected the target scales
that the official sources offer too: 1:\numprint{50000} and
1:\numprint{250000}. The \textsc{dhalfcircle} values for the subset are as
follows:
\begin{table}[ht]
\centering
\begin{tabular}{ c D{.}{.}{1} }
Scale & \multicolumn{1}{c}{$D$ (m)} \\ \hline
1:\numprint{50000} & 75 \\
1:\numprint{250000} & 220 \\
\end{tabular}
\end{table}
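
Using the conversion from section~\ref{sec:shape-of-a-bend}, these diameters
correspond to the following adjusted-size thresholds (our own arithmetic,
rounded to the nearest square meter):
\[
A_{adj}(75) = \frac{\pi \cdot 75^2}{8} \approx \numprint{2209}\,\mathrm{m}^2, \qquad
A_{adj}(220) = \frac{\pi \cdot 220^2}{8} \approx \numprint{19007}\,\mathrm{m}^2
\]
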
Our generalized results are viewed from the following angles:
\begin{itemize}
\item Compare to the non-simplified originals.
\item Compare to the official datasets.
\item Compare to {\DP} and {\VW}.
\end{itemize}

\subsection{Generalization Results of Analyzed Rivers}
\label{sec:generalization-results-of-analyzed-rivers}
\subsubsection{Medium-scale (1:\numprint{50000})}
\label{sec:analyzed-medium-scale}

\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{salvis-wm-50k}
\caption{2x zoomed-in {\WM} for 1:\numprint{50000}.}
\label{fig:salvis-wm-50k}
\end{figure}

As figure~\ref{fig:salvis-wm-50k} shows, the results deliver what the
algorithm promises, with a few caveats. The left side of the
figure looks reasonably well simplified: long bends remain slightly curved,
and small bends are removed or slightly exaggerated.

The left part of figure~\ref{fig:salvis-wm-50k} is shown enlarged in
figure~\ref{fig:salvis-wm-50k-nw}. Some bends were exaggerated well, and
some were eliminated.
\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{salvis-wm-50k-nw}
\caption{Left part of figure~\ref{fig:salvis-wm-50k}.}
\label{fig:salvis-wm-50k-nw}
\end{figure}
In the top--right side (clipped in figure~\ref{fig:salvis-wm-50k-ne}), some
jagged and sharp bends appear. These become more pronounced in the
1:\numprint{250000} simplification in the next section.

\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{salvis-wm-50k-ne}
\caption{Top--right part of figure~\ref{fig:salvis-wm-50k}.}
\label{fig:salvis-wm-50k-ne}
\end{figure}

To sum up, mid-scale simplification works well for some geometries, but creates
sharp edges for others.
\subsubsection{Large-scale (1:\numprint{250000})}
\label{sec:analyzed-large-scale}

As figure~\ref{fig:salvis-wm-250k-10x} shows, some of the resulting bends in
the large-scale map look significantly exaggerated. To show why,
figure~\ref{fig:salvis-wm-250k-overlaid-zoom} zooms in on the large-scale
simplification and overlays the original.
\begin{figure}[ht]
\centering
\begin{subfigure}[b]{.49\textwidth}
\centering
\includegraphics[width=.2\textwidth]{salvis-250k-10x}
\caption{Original.}
\end{subfigure}
\hfill
\begin{subfigure}[b]{.49\textwidth}
\centering
\includegraphics[width=.2\textwidth]{salvis-wm-250k-10x}
\caption{Simplified.}
\end{subfigure}
\caption{GDB10LT simplified with {\WM} for 1:\numprint{250000}.}
\label{fig:salvis-wm-250k-10x}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics[width=.8\textwidth]{salvis-wm-overlaid-250k-zoom}
\caption{10x zoomed-in {\WM} for 1:\numprint{250000}.}
\label{fig:salvis-wm-250k-overlaid-zoom}
\end{figure}

A conglomeration of bends is visible, especially in the top--right side of the
illustration. We assume this was caused by two significantly exaggerated bends
leaving no space to exaggerate those between the two.

\subsubsection{Discussion}
For mid-size scales of 1:\numprint{50000}, the implemented algorithm works well
for certain geometries, and poorly for others. This test surfaced two areas for
future research and improvement:
\begin{itemize}
\item Exaggeration sometimes creates sharp edges, especially when the
exaggerated bend is quite small. When sharp edges are created,
exaggeration could interpolate more points in the bend, and exaggerate
using the interpolated points.
\item In larger scales, when bends do not have space to exaggerate, they
should be combined or eliminated instead.
\end{itemize}
\subsection{Comparison with National Spatial Datasets}
\subsubsection{Background}
There are a few datasets used in this comparison: GDB10LT, GDR50LT and
GDR250LT. They are vector datasets, which include rivers. They can be
downloaded for free from \cite{nzt}. Here are the meanings of the codenames:
\begin{description}
\item[GDB10LT] is the dataset of the highest detail, suited for maps of scale
1:\numprint{10000}.
\item[GDR50LT] is suited for maps of scale 1:\numprint{50000}.
\item[GDR250LT] offers the least detail, and is suited for maps of
scale 1:\numprint{250000}.
\end{description}
During the analysis, we ran {\WM} on GDB10LT for 2 destination scales:
1:\numprint{50000} and 1:\numprint{250000}.\footnote{Parameter calculation is
detailed in section~\ref{sec:bend-scaling-and-dimensions}.} This section
compares the resulting {\WM}--generalized rivers to GDR50LT and GDR250LT.
\subsubsection{Medium-scale (1:\numprint{50000})}
For our research location, the national dataset GDB10LT is almost equivalent to
GDR50LT, with a few nuances. Figure~\ref{fig:salvis-wm-gdr50} illustrates
all three shapes: GDR50LT, {\WM}--simplified GDB10LT, and the original GDB10LT.

\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{salvis-wm-gdr50}
\caption{2x zoomed-in GDR50LT (green), {\WM}--simplified GDB10LT (orange)
and original GDB10LT (dotted black).}
\label{fig:salvis-wm-gdr50}
\end{figure}
\begin{figure}[h!]
\centering
\includegraphics[width=\textwidth]{salvis-wm-gdr50-ne}
\caption{Top--right side of figure~\ref{fig:salvis-wm-gdr50}.}
\label{fig:salvis-wm-gdr50-ne}
\end{figure}
Although the lines are almost identical, figure~\ref{fig:salvis-wm-gdr50-ne}
shows two small bends that have been removed in GDR50LT, but have been
exaggerated by our implementation.

\subsubsection{Large-scale (1:\numprint{250000})}
\label{sec:national-large-scale}
Figure~\ref{fig:salvis-wm-250k} illustrates the original GDR250LT and the
{\WM}--simplified version. As section~\ref{sec:analyzed-large-scale} explains,
the algorithm tries to exaggerate many bends to a great size. However, GDR250LT
takes the opposite approach --- only the very basic shapes of the largest bends
are retained. Time and customers will tell which approach is more appropriate,
once the current {\WM} implementation receives more time and attention, as
suggested in section~\ref{sec:future-suggestions}.
\begin{figure}[h!]
\centering
\begin{subfigure}[b]{.49\textwidth}
\includegraphics[width=\textwidth]{salvis-gdr250-2x}
\caption{GDR250LT.}
\end{subfigure}
\hfill
\begin{subfigure}[b]{.49\textwidth}
\centering
\includegraphics[width=\textwidth]{salvis-wm-220}
\caption{{\WM}--simplified GDB10LT.}
\end{subfigure}
\caption{GDR250LT and {\WM}--simplified GDB10LT.}
\label{fig:salvis-wm-250k}
\end{figure}
\subsection{Comparison with {\DP} and {\VW}}
It is time to visually compare our implementation with the classical
algorithms: {\DP}, {\VW} and Chaikin. Since we have established that more work
is needed for 1:\numprint{250000} maps, we limit the comparison
in this section to 1:\numprint{50000}.
\begin{figure}[h!]
\includegraphics[width=\textwidth]{salvis-wm-dp-50k}
\caption{{\DP} (green), {\WM} (orange) and original (black dotted) at
1:\numprint{50000}.}
\label{fig:salvis-wm-dp-50k}
\end{figure}
\begin{figure}[h!]
\includegraphics[width=\textwidth]{salvis-wm-dp-chaikin-50k}
\caption{Chaikin--smoothened {\DP} (green), {\WM} (orange) and original
(black dotted) at 1:\numprint{50000}.}
\label{fig:salvis-wm-dp-chaikin-50k}
\end{figure}
\begin{figure}[h!]
\includegraphics[width=\textwidth]{salvis-wm-vw-50k}
\caption{{\VW} (green), {\WM} (orange) and original (black dotted) at
1:\numprint{50000}.}
\label{fig:salvis-wm-vw-50k}
\end{figure}
\begin{figure}[h!]
\includegraphics[width=\textwidth]{salvis-wm-vw-chaikin-50k}
\caption{Chaikin--smoothened {\VW} (green), {\WM} (orange) and original
(black dotted) at 1:\numprint{50000}.}
\label{fig:salvis-wm-vw-chaikin-50k}
\end{figure}
\subsection{Testing Results Online}
\label{sec:testing-results-online}

An on-line tool\cite{openmapwm} has been developed to test parameters of the
{\WM} algorithm. A user selects a river of interest, enters the
\textsc{dhalfcircle} parameter and clicks "Submit". The simplified line feature
is then overlaid on top of the map.

Figure~\ref{fig:openmap-wm-good} shows a result that looks reasonably good.
Figure~\ref{fig:openmap-wm-bad} shows that the algorithm
produces poorly simplified results for some geometries.
\begin{figure}[ht]
\centering
\includegraphics[width=\textwidth]{openmap-wm-good.png}
\caption{Example on-line test tool for {\WM} algorithm.}
\label{fig:openmap-wm-good}
\end{figure}
\begin{figure}[ht]
\centering
\includegraphics[width=.5\textwidth]{openmap-wm-bad.png}
\caption{Another example from the on-line test tool.}
\label{fig:openmap-wm-bad}
\end{figure}
\section{Conclusions}
\label{sec:conclusions}

Classical and modern line simplification algorithms were evaluated, and their
main problems identified. A method for the technical implementation of {\WM}
was defined, and the algorithm was implemented. Each geometric
transformation was described and visualized. The implemented algorithm was
applied to different shapes and compared to national (Lithuanian) datasets.

About 1,000 lines of Procedural SQL were written for the algorithm and tests,
and a few hundred lines of supporting scripts in Make, Python, Awk and Bash.
Thanks to its permissive license and early interest, the algorithm code has
already been used to create a prototype on-line service to evaluate the
robustness of the algorithm.

\section{Future Suggestions}
\label{sec:future-suggestions}

These are the areas for possible future work on this published
implementation:
\begin{itemize}
\item Implement the bend combination operator
(section~\ref{sec:combination-operator}).
\item Fine-tune parameters for bend exaggeration.
Section~\ref{sec:generalization-results-of-analyzed-rivers} contains
examples of exaggerated bends that became sharp and includes some ideas
for future work.

\item Determine the exaggeration limits when working with large scales.
Section~\ref{sec:national-large-scale} discusses examples showing that some
limits are necessary.
\item Research when bends should be marked as \textsc{isolated}. As the
examples show, the current criteria are not robust enough.
\item Once the points above yield a satisfactory result, the efficiency of the
algorithm could be improved to work on the lines in a "streaming" fashion
(more details in section~\ref{sec:algorithm-implementation-process}).
\end{itemize}
That sums up what could be improved without changing the algorithm in a
significant way. Beyond that, a further area of research is graduating the
algorithm from "isolated cartographic generalization" to "full
cartographic generalization". The current operators of the {\WM} algorithm
offer a few avenues to preserve the surrounding topology. These could be
further researched and extended.

\section{Acknowledgments}
\label{sec:acknowledgments}

I would like to thank my thesis supervisor, Andrius Balčiūnas, for his help in
formulating the requirements and providing early editorial feedback for the
thesis.

I am grateful to Tomas Straupis, who handed me the {\WM}\cite{wang1998line}
paper on a warm pre-COVID summer evening. I got intrigued. He was also an early
beta-tester of my implementation, and helped me understand where the initial
algorithm descriptions were ambiguous.

Many thanks to NŽT for providing the datasets with a very permissive license.

\printbibliography
\begin{appendices}
\section{Code Listings}
This section contains code listings of the {\WM} algorithm.
\subsection{Re-Generating This Paper}
\label{sec:code-regenerate}

As explained in section~\ref{sec:reproducing-the-paper}, illustrations in
this paper are generated from a small list of sample geometries. To inspect
the source geometries or regenerate this paper, run the script in
listing~\ref{lst:extract-and-generate} (assuming
the name of this document is \textsc{mj-msc-full.pdf}).

The script extracts the source files from
\textsc{mj-msc-full.pdf} to a temporary directory, runs the top-level
\textsc{make} command, and displays the generated document. Source code for
the algorithm, as well as other supporting files, can be found in the
temporary directory.

\begin{longlisting}
\inputcode{bash}{extract-and-generate}
\caption{\textsc{extract-and-generate}}
\label{lst:extract-and-generate}
\end{longlisting}
\subsection{Function \textsc{st\_simplifywm}}
\begin{longlisting}
\inputcode{postgresql}{wm.sql}
\caption{\textsc{wm.sql}}
\label{lst:wm.sql}
\end{longlisting}
\subsection{Function \textsc{aggregate\_rivers}}
\begin{longlisting}
\inputcode{postgresql}{aggregate-rivers.sql}
\caption{\textsc{aggregate-rivers.sql}}
\label{lst:aggregate-rivers.sql}
\end{longlisting}
\end{appendices}
\end{document}