\documentclass[a4paper]{article}
|
|
|
|
\usepackage[T1]{fontenc}
|
|
\usepackage[english]{babel}
|
|
\usepackage[utf8]{inputenc}
|
|
\usepackage{a4wide}
|
|
\usepackage[autostyle, english=american]{csquotes}
|
|
\MakeOuterQuote{"}
|
|
\usepackage[maxbibnames=99,style=numeric,sorting=none]{biblatex}
|
|
\addbibresource{bib.bib}
|
|
\usepackage[pdfusetitle]{hyperref}
|
|
\usepackage{enumitem}
|
|
\usepackage[toc,page,title]{appendix}
|
|
\usepackage{caption}
|
|
\usepackage{subcaption}
|
|
\usepackage{gensymb}
|
|
\usepackage{units}
|
|
\usepackage{varwidth}
|
|
\usepackage{tabularx}
|
|
\usepackage{float}
|
|
\usepackage{tikz}
|
|
\usepackage{minted}
|
|
\usepackage{fancyvrb}
|
|
\input{version.inc}
|
|
\input{vars.inc}
|
|
|
|
\newcommand{\onpage}[1]{\ref{#1} on page~\pageref{#1}}
|
|
\newcommand{\titlecite}[1]{\citetitle{#1} \cite{#1}}
|
|
\newcommand{\DP}{Douglas \& Peucker}
|
|
\newcommand{\VW}{Visvalingam--Whyatt}
|
|
\newcommand{\WM}{Wang--M{\"u}ller}
|
|
\newcommand{\MYTITLE}{Cartographic Generalization of Lines using free software (example of rivers)}
|
|
\newcommand{\MYAUTHOR}{Motiejus Jakštys}
|
|
|
|
\title{\MYTITLE}
|
|
\author{\MYAUTHOR}
|
|
\date{\VCDescribe}
|
|
|
|
\begin{document}
|
|
|
|
\begin{titlepage}
|
|
\begin{center}
|
|
\includegraphics[width=0.4\textwidth]{vu}
|
|
|
|
\huge
|
|
\textbf{\MYTITLE} \\[4ex]
|
|
|
|
\LARGE
|
|
\textbf{\MYAUTHOR} \\[8ex]
|
|
|
|
\vfill
|
|
|
|
A thesis presented for the degree of\\
|
|
Master in Cartography \\[3ex]
|
|
|
|
\large
|
|
\VCDescribe
|
|
\end{center}
|
|
\end{titlepage}
|
|
|
|
\begin{abstract}
|
|
\label{sec:abstract}
|
|
Current open-source line generalization solutions have their roots in
|
|
mathematics and geometry, and are not fit for natural objects like rivers
|
|
and coastlines. This paper discusses our implementation of the {\WM} algorithm
under an open-source license, explains details that we would have
appreciated in the original paper, and compares our results to different
|
|
generalization algorithms.
|
|
\end{abstract}
|
|
|
|
\newpage
|
|
|
|
\tableofcontents
|
|
\listoffigures
|
|
|
|
\newpage
|
|
|
|
\section{Introduction}
|
|
\label{sec:introduction}
|
|
|
|
When creating small-scale maps, often the detail of the data source is greater
|
|
than desired for the map. This becomes especially acute for natural features
|
|
that have many bends, like coastlines, rivers and forest boundaries.
|
|
|
|
To create a small-scale map from a large-scale data source, these features need
|
|
to be generalized: detail should be reduced. However, while doing so, it is
|
|
important to preserve the "defining" shape of the original feature, otherwise
|
|
the result will look unrealistic.
|
|
|
|
For example, if a river is nearly straight, it should be nearly straight after
|
|
generalization, otherwise an overly straightened river will look like a canal.
|
|
Conversely, if the river is highly wiggly, the number of bends should be
|
|
reduced, but not removed.
|
|
|
|
The generalization problem for other objects can often be solved by
non-geometric means:
|
|
|
|
\begin{itemize}
|
|
\item Towns and cities can be filtered and generalized by number of
|
|
inhabitants.
|
|
\item Roads can be eliminated by the road length, number of lanes, or
|
|
classification of the road (local, regional, international).
|
|
\end{itemize}
|
|
|
|
The natural line generalization problem can be viewed as having two competing
|
|
goals:
|
|
|
|
\begin{itemize}
|
|
\item Reduce detail by removing or simplifying "less important" features.
|
|
\item Retain enough detail, so the original is still recognizable.
|
|
\end{itemize}
|
|
|
|
Given the discussed complexities, a fine line between under-generalization
|
|
(leaving the object as-is) and over-generalization (making a straight line) must be
|
|
found. Therein lies the complexity of generalization algorithms: all have
|
|
different trade-offs.
|
|
|
|
\section{Literature review and problem statement}
|
|
\label{sec:literature-review}
|
|
|
|
A number of cartographic line generalization algorithms have been researched.
|
|
The "classical" ones are {\DP} and {\VW}.
|
|
|
|
\subsection{Available algorithms}
|
|
|
|
\subsubsection{{\DP}, {\VW} and Chaikin's}
|
|
|
|
{\DP} \cite{douglas1973algorithms} and {\VW} \cite{visvalingam1993line} are
"classical" line generalization computer graphics algorithms. They are
relatively simple to implement and require few runtime resources. Both of them
|
|
accept only a single parameter, based on desired scale of the map, which makes
|
|
them very simple to adjust for different scales.
|
|
|
|
Both algorithms are part of PostGIS, a free-software GIS suite:
|
|
\begin{itemize}
|
|
\item {\DP} via
|
|
\href{https://postgis.net/docs/ST_Simplify.html}{PostGIS Simplify}.
|
|
|
|
\item {\VW} via
|
|
\href{https://postgis.net/docs/ST_SimplifyVW.html}{PostGIS SimplifyVW}.
|
|
\end{itemize}
|
|
|
|
Examples of <TBD Chaikin and others>
|
|
|
|
It may be worthwhile to post-process those through a widely available Chaikin's
|
|
line smoothing algorithm \cite{chaikin1974algorithm} via
|
|
\href{https://postgis.net/docs/ST_ChaikinSmoothing.html}{PostGIS
|
|
ChaikinSmoothing}.
|
|
|
|
|
|
\subsubsection{Modern approaches}
|
|
|
|
Due to their simplicity and ubiquity, {\DP} and {\VW} have been established as
|
|
go-to algorithms for line generalization. During recent years, alternatives
|
|
have emerged. These modern replacements fall into roughly two categories:
|
|
|
|
\begin{itemize}
|
|
|
|
\item Cartographic knowledge was encoded to an algorithm (bottom-up
|
|
approach). One of these is \titlecite{wang1998line}, also known
as the {\WM} algorithm.
|
|
|
|
\item Mathematical shape transformation which yields a more cartographic
|
|
result. E.g. \titlecite{jiang2003line},
|
|
\titlecite{dyken2009simultaneous}, \titlecite{mustafa2006dynamic},
|
|
\titlecite{nollenburg2008morphing}.
|
|
|
|
\end{itemize}
|
|
|
|
Authors of most of the aforementioned articles have implemented the
|
|
generalization algorithm, at least to generate the visuals in the articles.
|
|
However, I was not able to find code for any of those to evaluate with my
|
|
desired data set, or use as a basis for my own maps. {\WM} \cite{wang1998line}
|
|
is available in a commercial product.
|
|
|
|
The lack of robust, openly available generalization algorithm implementations poses
a problem for map creation with free software: there is no comparable
high-quality simplification algorithm to create down-scaled maps, so any
cartographic work that uses line generalization as part of its processing
will be of sub-par quality. We believe that availability of high-quality
open-source tools is an important foundation for future cartographic
experimentation and development, and thus benefits the cartographic community
as a whole.
|
|
|
|
\subsection{Problems with generalization of rivers}
|
|
|
|
\section{Methodology}
|
|
\label{sec:methodology}
|
|
|
|
The original {\WM} algorithm \cite{wang1998line} leaves something to be
|
|
desired for a practical implementation: it is not straightforward to implement
|
|
the algorithm from the paper alone.
|
|
|
|
Explanations in this document are meant to expand, rather than substitute, the
|
|
original description in {\WM}. Therefore familiarity with the original paper is
|
|
assumed, and, for some sections, having it close-by is necessary to
|
|
meaningfully follow this document.
|
|
|
|
In this paper we describe {\WM} at a level of detail that is more useful for implementing the algorithm:
|
|
each section will be expanded, with more elaborate and exact illustrations for
|
|
every step of the algorithm.
|
|
|
|
Algorithms discussed in this paper assume Euclidean geometry.
|
|
|
|
\subsection{Vocabulary and terminology}
|
|
|
|
This section defines vocabulary and terms as used in the rest of the paper.
|
|
|
|
\begin{description}
|
|
|
|
\item[Vertex] is a point on a plane that can be expressed by a pair of $(x,y)$
|
|
coordinates.
|
|
|
|
\item[Line Segment (or Segment)] joins two vertices by a straight line. A
|
|
segment can be expressed by two coordinate pairs: $(x_1, y_1)$ and
|
|
$(x_2, y_2)$. Line Segment and Segment are used interchangeably
|
|
throughout the paper.
|
|
|
|
\item[Line] represents a single linear feature in the real world. For
|
|
example, a river or a coastline. {\tt LINESTRING} in GIS terms.
|
|
|
|
Geometrically, a line is a series of connected line segments, or,
equivalently, a series of connected vertices. Each vertex connects to
two other vertices, except the vertices at either end of the line:
|
|
these two connect to a single other vertex.
|
|
|
|
\item[Bend] is a subset of a line that humans perceive as a curve. The
|
|
geometric definition is complex and is discussed in
|
|
section~\onpage{sec:definition-of-a-bend}.
|
|
|
|
\item[Baseline] is the line segment between a bend's first and last vertices.
|
|
|
|
\item[Sum of inner angles] TBD.
|
|
|
|
\end{description}
|
|
|
|
\subsection{Radians and Degrees}
|
|
|
|
This document contains a few constant angles expressed in radians.
|
|
Table~\ref{table:radians} summarizes some of the values used in this document
|
|
and the implementation.
|
|
|
|
\begin{table}[h]
|
|
\centering
|
|
\begin{tabular}{|c|c|c|c|c|c|}
|
|
\hline
|
|
Degrees & $30^\circ$ & $45^\circ$ & $90^\circ$ & $180^\circ$ & $360^\circ$ \\
|
|
\hline
|
|
Radians & $\nicefrac{\pi}{6}$ & $\nicefrac{\pi}{4}$ & $\nicefrac{\pi}{2}$ & $\pi$ & $2\pi$ \\
|
|
\hline
|
|
\end{tabular}
|
|
\caption{Common degree and radian values}
|
|
\label{table:radians}
|
|
\end{table}
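
The values in table~\ref{table:radians} follow from the usual conversion
between degrees and radians, restated here for convenience:

\[
rad = deg \cdot \frac{\pi}{180}, \qquad deg = rad \cdot \frac{180}{\pi}
\]

For example, $45^\circ \cdot \nicefrac{\pi}{180} = \nicefrac{\pi}{4}$.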
|
|
|
|
\subsection{Automated tests}
|
|
\label{sec:automated-tests}
|
|
|
|
As part of the algorithm realization, an automated test suite has been
|
|
developed. Shapes to test each function have been hand-crafted and expected
|
|
results have been manually calculated. The test suite executes parts of the
|
|
algorithm against a predefined set of geometries, and asserts that the output
|
|
matches the resulting hand-calculated geometry.
|
|
|
|
The full set of test geometries is visualized in
|
|
figure~\onpage{fig:test-figures}. The figure includes arrows depicting line
|
|
direction.
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=\linewidth]{test-figures}
|
|
\caption{Line geometries for automated test cases}
|
|
\label{fig:test-figures}
|
|
\end{figure}
|
|
|
|
The full test suite can be executed with a single command, and completes in a
|
|
few seconds. Having an easily accessible test suite boosts confidence that no
unexpected bugs have crept in while modifying the algorithm.
|
|
|
|
\section{Description of the implementation}
|
|
|
|
As alluded to in section~\onpage{sec:methodology}, the {\WM} paper skims over
certain details that are important for implementing the algorithm. This section
|
|
goes through each algorithm stage, illustrating the intermediate steps and
|
|
explaining the author's desiderata for a more detailed description.
|
|
|
|
Illustrations of the following sections are extracted from the automated test
|
|
cases, which were written during the algorithm implementation (as discussed in
|
|
section~\onpage{sec:automated-tests}).
|
|
|
|
Lines in illustrations are black, and bends are heavily colored after
|
|
converting them to polygons. Bends are converted to polygons (for illustration
|
|
purposes) using the following algorithm:
|
|
|
|
\begin{itemize}
|
|
\item Join the first and last vertices of the bend, creating a polygon.
|
|
\item Color the polygons using distinct colors.
|
|
\end{itemize}
|
|
|
|
\subsection{Definition of a Bend}
|
|
\label{sec:definition-of-a-bend}
|
|
|
|
The original article describes a bend as:
|
|
|
|
\begin{displaycquote}{wang1998line}
|
|
A bend can be defined as that part of a line which contains a number of
|
|
subsequent vertices, with the inflection angles on all vertices included in
|
|
the bend being either positive or negative and the inflection of the bend's
|
|
two end vertices being in opposite signs.
|
|
\end{displaycquote}
|
|
|
|
While it gives a good intuitive understanding of what the bend is, this section
|
|
provides more technical details. Here are some non-obvious characteristics that
|
|
are necessary when writing code to detect the bends:
|
|
|
|
\begin{itemize}
|
|
\item End segments of each line should also belong to bends. That way, all
segments belong to 1 or 2 bends.

\item The first and last segments of each bend (except for the two end-line
segments) are shared with the neighboring bends: the last segment of a
bend is also the first segment of the next bend.
|
|
\end{itemize}
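
The characteristics above can be sketched in code. The following Python
fragment is a minimal illustration under stated assumptions, not the
implementation from this paper: it assumes a line is a list of $(x,y)$ tuples,
takes the sign of the cross product of two adjacent segments as the sign of the
inflection angle, and ignores collinear vertices. The names
\texttt{turn\_sign} and \texttt{detect\_bends} are hypothetical.

\begin{minted}[fontsize=\small]{python}
def turn_sign(a, b, c):
    # Sign of the turn a -> b -> c: +1 left, -1 right, 0 collinear.
    cross = (b[0] - a[0]) * (c[1] - b[1]) - (b[1] - a[1]) * (c[0] - b[0])
    return (cross > 0) - (cross < 0)


def detect_bends(line):
    # Split a line (list of (x, y) tuples) into bends.
    # Neighboring bends share exactly one segment.
    bends = []
    start = 0      # index of the first vertex of the current bend
    prev_sign = 0  # turn direction of the current bend, 0 if unknown
    for i in range(1, len(line) - 1):
        sign = turn_sign(line[i - 1], line[i], line[i + 1])
        if sign == 0:
            continue  # a collinear vertex does not change the direction
        if prev_sign != 0 and sign != prev_sign:
            # The direction flips at vertex i: the current bend ends
            # with segment (i-1, i), which also starts the next bend.
            bends.append(line[start:i + 1])
            start = i - 1
        prev_sign = sign
    bends.append(line[start:])  # the end segment belongs to the last bend
    return bends
\end{minted}

For a line with one change of turning direction, this sketch returns two bends,
and the last two vertices of the first bend are the first two vertices of the
second.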
|
|
|
|
The properties above may be apparent from the illustrations in this article
or from the description here, but they are not nearly as apparent in the original
article.
|
|
|
|
Figure~\ref{fig:fig8-definition-of-a-bend} reproduces Figure 8 of the original article,
but with bends colored as polygons: each color is a distinct bend.
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\includegraphics[width=\linewidth]{fig8-definition-of-a-bend}
|
|
\caption{Originally Figure 8: detected bends are highlighted}
|
|
\label{fig:fig8-definition-of-a-bend}
|
|
\end{figure}
|
|
|
|
\subsection{Gentle Inflection at End of a Bend}
|
|
|
|
The gist of the section is in the original article:
|
|
|
|
\begin{displaycquote}{wang1998line}
|
|
But if the inflection that marks the end of a bend is quite small, people
|
|
would not recognize this as the bend point of a bend
|
|
\end{displaycquote}
|
|
|
|
Figure~\ref{fig:fig5-gentle-inflection} visualizes Figure 5 of the original paper,
in which a single vertex at the end of a bend is re-assigned to the neighboring bend.
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\begin{subfigure}[b]{.49\textwidth}
|
|
\includegraphics[width=\textwidth]{fig5-gentle-inflection-before}
|
|
\caption{Before applying the inflection rule}
|
|
\end{subfigure}
|
|
\hfill
|
|
\begin{subfigure}[b]{.49\textwidth}
|
|
\includegraphics[width=\textwidth]{fig5-gentle-inflection-after}
|
|
\caption{After applying the inflection rule}
|
|
\end{subfigure}
|
|
\caption{Originally Figure 5: gentle inflections at the ends of the bend}
|
|
\label{fig:fig5-gentle-inflection}
|
|
\end{figure}
|
|
|
|
The illustration for this section is clear, but insufficient: it does not
specify how many vertices should be included when calculating the end-of-bend
inflection. We chose an iterative approach: as long as the angle is "right"
and the distance is decreasing, the algorithm keeps re-assigning vertices
to the neighboring bend, with practically no upper bound on the number of
iterations.
|
|
|
|
To verify that the algorithm implementation is correct for multiple vertices,
an additional example was created and is illustrated in
|
|
figure~\ref{fig:inflection-1-gentle-inflection}: the rule re-assigns two
|
|
vertices to the next bend instead of one.
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\begin{subfigure}[b]{.45\textwidth}
|
|
\includegraphics[width=\textwidth]{inflection-1-gentle-inflection-before}
|
|
\caption{Before applying the inflection rule}
|
|
\end{subfigure}
|
|
\hfill
|
|
\begin{subfigure}[b]{.45\textwidth}
|
|
\includegraphics[width=\textwidth]{inflection-1-gentle-inflection-after}
|
|
\caption{After applying the inflection rule}
|
|
\end{subfigure}
|
|
\caption{Gentle inflection at the end of the bend when multiple vertices are moved}
|
|
\label{fig:inflection-1-gentle-inflection}
|
|
\end{figure}
|
|
|
|
Finding and fixing gentle inflections requires running the algorithm in
both directions; if implemented as documented, the steps will fail to match
some bends that should be mutated. This implementation does it in the following
way:
|
|
|
|
\begin{enumerate}
|
|
\item Run the algorithm from beginning to the end.
|
|
\item \label{rev1} Reverse the line and each bend.
|
|
\item Run the algorithm again.
|
|
\item \label{rev2} Reverse the line and each bend.
|
|
\item Return result.
|
|
\end{enumerate}
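
The steps above can be sketched as a small wrapper. This is an illustration
only: it assumes bends are stored as a list of vertex lists, and
\texttt{one\_pass} is a hypothetical function that applies the gentle
inflection rule while walking the bends from first to last.

\begin{minted}[fontsize=\small]{python}
def reverse_bends(bends):
    # Reverse the order of the bends and the vertex order inside each bend.
    return [bend[::-1] for bend in reversed(bends)]


def fix_in_both_directions(bends, one_pass):
    bends = one_pass(bends)       # run from beginning to end
    bends = reverse_bends(bends)  # reverse the line and each bend
    bends = one_pass(bends)       # run again, now end to beginning
    return reverse_bends(bends)   # restore the original direction
\end{minted}

Because the reversal is applied twice, a pass that changes nothing returns the
bends in their original order and direction.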
|
|
|
|
The current implementation is the most straightforward, but not optimal:
reversing the line and its bends could be avoided by walking the line backwards.
In that case, steps \ref{rev1} and \ref{rev2} could be skipped, thus saving
memory and computation time.
|
|
|
|
The "quite small angle" was arbitrarily chosen to be $\smallAngle$.
|
|
|
|
\subsection{Self-line Crossing When Cutting a Bend}
|
|
|
|
When a bend's baseline crosses another bend, it is called self-crossing. This is
undesirable for the upcoming operators, and self-crossings should be removed
following the rules of the article.
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\begin{subfigure}[b]{.4\textwidth}
|
|
\includegraphics[width=\textwidth]{fig6-selfcrossing-before}
|
|
\caption{Bend's baseline (dotted) is crossing a neighboring bend}
|
|
\end{subfigure}
|
|
\hfill
|
|
\begin{subfigure}[b]{.4\textwidth}
|
|
\includegraphics[width=\textwidth]{fig6-selfcrossing-after}
|
|
\caption{Self-crossing removed following the algorithm}
|
|
\end{subfigure}
|
|
\caption{Originally Figure 6: simple case of self-line crossing}
|
|
\label{fig:fig6-selfcrossing}
|
|
\end{figure}
|
|
|
|
A self-crossing may involve not only the neighboring bend, but any other
bend in the line. For example, the baseline of the bend $(A, B)$ may cross
|
|
different bends in between, as depicted in
|
|
figure~\onpage{fig:selfcrossing-1-non-neighbor}.
|
|
|
|
\begin{figure}[h]
|
|
\centering
|
|
\begin{subfigure}[b]{.4\textwidth}
|
|
\includegraphics[width=\textwidth]{selfcrossing-1-before}
|
|
\caption{Bend's baseline (dotted) is crossing a non-neighboring bend}
|
|
\end{subfigure}
|
|
\hfill
|
|
\begin{subfigure}[b]{.4\textwidth}
|
|
\includegraphics[width=\textwidth]{selfcrossing-1-after}
|
|
\caption{Self-crossing removed following the algorithm}
|
|
\end{subfigure}
|
|
\caption{Self-crossing with non-neighboring bend}
|
|
\label{fig:selfcrossing-1-non-neighbor}
|
|
\end{figure}
|
|
|
|
Naively implemented, checking every bend against every other bend costs $O(n^2)$. In
other words, the time it takes to run the algorithm grows quadratically with
the number of vertices.
|
|
|
|
It is possible to optimize this step and skip checking some of the bends. Only
bends whose sum of inner angles is at least $\pi$ can ever self-cross. If the sum is
less than $\pi$, the bend cannot cross other bends. That way, only a fraction of
bends need to be checked.
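
The naive check can be sketched as follows. This Python fragment is an
illustration under stated assumptions, not the code of this implementation:
bends are lists of $(x,y)$ tuples, only proper crossings are reported (a
baseline that merely touches a segment endpoint is ignored), and the names
\texttt{orient}, \texttt{segments\_cross} and \texttt{baseline\_crossings} are
hypothetical. The inner-angle filter described above is not included.

\begin{minted}[fontsize=\small]{python}
def orient(a, b, c):
    # Side of point c relative to the directed line a -> b:
    # +1 left, -1 right, 0 on the line.
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)


def segments_cross(p1, p2, q1, q2):
    # True if segments p1-p2 and q1-q2 cross at an interior point of both.
    return (orient(p1, p2, q1) * orient(p1, p2, q2) < 0
            and orient(q1, q2, p1) * orient(q1, q2, p2) < 0)


def baseline_crossings(bends, i):
    # Indices of the bends that are crossed by the baseline of bends[i].
    first, last = bends[i][0], bends[i][-1]
    hits = []
    for j, other in enumerate(bends):
        if j == i:
            continue
        segments = zip(other, other[1:])
        if any(segments_cross(first, last, a, b) for a, b in segments):
            hits.append(j)
    return hits
\end{minted}

Calling \texttt{baseline\_crossings} once per bend compares every bend with
every other bend, which is the quadratic behavior described above.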
|
|
|
|
\subsection{Attributes of a Single Bend}
|
|
|
|
\textsc{Compactness Index} is "the ratio of the area of the polygon over the
|
|
circle whose circumference length is the same as the length of the
|
|
circumference of the polygon" \cite{wang1998line}. Given a bend, its
|
|
compactness index is calculated as follows:
|
|
|
|
\begin{enumerate}
|
|
|
|
\item Construct a polygon by joining the first and last vertices of the bend.

\item Calculate the area $P$ of the polygon.

\item Calculate the perimeter $u$ of the polygon. The same value is the
circumference of the circle.

\item Given the circle's circumference $u$, the circle's area $A$ is:
|
|
|
|
\[
|
|
A = \frac{u^2}{4\pi}
|
|
\]
|
|
|
|
\item Compactness index is $\nicefrac{P}{A}$:
|
|
|
|
\[
|
|
cmp = \frac{P}{A} = \frac{P}{ \frac{u^2}{4\pi} } = \frac{4\pi P}{u^2}
|
|
\]
|
|
|
|
\end{enumerate}
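
The calculation can be sketched in a few lines of Python. This is an
illustration, not the code of this implementation: it assumes a bend is a list
of $(x,y)$ tuples in a Euclidean plane, computes the polygon's area with the
shoelace formula, and uses hypothetical function names.

\begin{minted}[fontsize=\small]{python}
import math


def polygon_area(ring):
    # Shoelace formula; the ring is closed implicitly by joining
    # the last vertex back to the first.
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def polygon_perimeter(ring):
    return sum(math.dist(a, b) for a, b in zip(ring, ring[1:] + ring[:1]))


def compactness_index(bend):
    area = polygon_area(bend)            # P
    perimeter = polygon_perimeter(bend)  # u
    return 4 * math.pi * area / perimeter ** 2
\end{minted}

For a bend that follows three sides of a unit square, $P = 1$ and $u = 4$, so
the compactness index is $\nicefrac{\pi}{4} \approx 0.785$.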
|
|
|
|
Once this section is implemented, each bend has a list
of properties on which later steps operate.
|
|
|
|
\subsection{Shape of a Bend}
|
|
|
|
This section introduces \textsc{adjusted size}, which trivially derives from
\textsc{compactness index} $cmp$ and the area $P$ of the bend's polygon:

\[
adjsize = \frac{0.75 P}{cmp}
\]
|
|
|
|
Adjusted size becomes necessary later to compare bends with each other and
find similar ones.
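
Substituting the definition of the compactness index shows what the adjusted
size measures. Assuming that the shape's area in the formula above is the area
$P$ of the bend's polygon and that $u$ is its perimeter, as in the compactness
index calculation:

\[
adjsize = \frac{0.75 P}{cmp}
        = 0.75 P \cdot \frac{u^2}{4\pi P}
        = \frac{3 u^2}{16\pi}
\]

Under this assumption, the adjusted size depends only on the perimeter of the
bend's polygon: it is three quarters of the area of a circle with
circumference $u$.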
|
|
|
|
\subsection{Isolated Bend}
|
|
|
|
A bend and its "isolation" can be described by \textsc{average curvature},
|
|
which is \textcquote{wang1998line}{geometrically defined as the ratio of
|
|
inflection over the length of a curve.}
|
|
|
|
Two conditions must be true to claim that a bend is isolated:
|
|
|
|
\begin{enumerate}
|
|
\item The \textsc{average curvature} of the neighboring bends should be larger
than the "candidate" bend's curvature; this implementation arbitrarily
chose $\isolationThreshold$ as the threshold.
|
|
|
|
\item Bends on both sides of the "candidate" should be longer than a
|
|
certain value. This implementation does not (yet) define such a
|
|
constraint and will only follow the average curvature constraint above.
|
|
\end{enumerate}
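
The average curvature of a single bend can be sketched as follows. This is one
possible reading, not the code of this implementation: it assumes that
"inflection" is the sum of absolute turning angles at the bend's interior
vertices and that the length of the curve is the sum of its segment lengths.
The function names are hypothetical, and the comparison against the threshold
is not shown.

\begin{minted}[fontsize=\small]{python}
import math


def turning_angle(a, b, c):
    # Signed angle in radians between segments a -> b and b -> c.
    v1 = (b[0] - a[0], b[1] - a[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.atan2(cross, dot)


def average_curvature(bend):
    # Total absolute inflection divided by the length of the bend.
    triples = zip(bend, bend[1:], bend[2:])
    inflection = sum(abs(turning_angle(a, b, c)) for a, b, c in triples)
    length = sum(math.dist(a, b) for a, b in zip(bend, bend[1:]))
    return inflection / length
\end{minted}

For a bend with two unit-length segments meeting at a right angle, the
inflection is $\nicefrac{\pi}{2}$ and the length is $2$, so the average
curvature is $\nicefrac{\pi}{4}$.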
|
|
|
|
\subsection{The Context of a Bend: Isolated and Similar Bends}
|
|
|
|
To find out whether two bends are similar, they are compared by 3 components:
|
|
|
|
\begin{enumerate}
|
|
\item \textsc{adjusted size}
|
|
\item \textsc{compactness index}
|
|
\item Baseline length
|
|
\end{enumerate}
|
|
|
|
These 3 components represent a point in the 3-dimensional space, and Euclidean
|
|
distance $d$ between those is calculated to differentiate between bends $p$ and
|
|
$q$:
|
|
|
|
\[
|
|
d(p,q) = \sqrt{(adjsize_p-adjsize_q)^2 +
|
|
(cmp_p-cmp_q)^2 +
|
|
(baseline_p-baseline_q)^2}
|
|
\]
|
|
|
|
The more similar the bends are, the smaller the distance $d$.
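
The distance can be computed directly from the three components. The sketch
below is an illustration with hypothetical names; it assumes each bend is
summarized as an \texttt{(adjsize, cmp, baseline)} tuple and uses
\texttt{math.dist} from the Python standard library.

\begin{minted}[fontsize=\small]{python}
import math


def bend_distance(p, q):
    # p and q are (adjsize, cmp, baseline) tuples of two bends.
    return math.dist(p, q)


def most_similar(target, candidates):
    # The candidate with the smallest distance d to the target bend.
    return min(candidates, key=lambda c: bend_distance(target, c))
\end{minted}

Two bends with identical components have a distance of zero.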
|
|
|
|
\subsection{Elimination Operator}
|
|
|
|
\subsection{Combination Operator}
|
|
|
|
\subsection{Exaggeration Operator}
|
|
|
|
\section{Program Implementation}
|
|
|
|
\section{Results of Experiments}
|
|
|
|
\section{Conclusions}
|
|
\label{sec:conclusions}
|
|
|
|
\section{Related Work and future suggestions}
|
|
\label{sec:related_work}
|
|
|
|
\printbibliography
|
|
|
|
\begin{appendices}
|
|
|
|
\section{Code listings}
|
|
|
|
\subsection{Reproducing the generalizations in this paper}
|
|
|
|
We strongly believe that the ability to reproduce the results is critical for any
|
|
scientific work. To make it possible for this paper, all source files and
|
|
accompanying scripts have been attached to the PDF. To re-generate this
|
|
document and its accompanying graphics, run this script (assuming the name of
|
|
this document is {\tt mj-msc-full.pdf}):
|
|
|
|
\inputminted[fontsize=\small]{bash}{extract-and-generate}
|
|
|
|
This was tested on Linux Debian 11 with upstream packages only.
|
|
|
|
\subsection{Algorithm code listings}
|
|
\inputminted[fontsize=\small]{postgresql}{wm.sql}
|
|
|
|
\end{appendices}
|
|
\end{document}
|