\documentclass[a4paper]{article}

\iffalse
\usepackage[L7x,T1]{fontenc}
\usepackage[lithuanian]{babel}
\else
\usepackage[T1]{fontenc}
\usepackage[english]{babel}
\fi

\usepackage[utf8]{inputenc}
\usepackage{a4wide}
\usepackage{csquotes}
\usepackage[maxbibnames=99,style=authoryear]{biblatex}
\usepackage[pdfusetitle]{hyperref}
\usepackage{enumitem}
\usepackage[toc,page,title]{appendix}
\addbibresource{bib.bib}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{gensymb}
\usepackage{varwidth}
\usepackage{tabularx}
\usepackage{float}
\usepackage{tikz}
\usepackage{minted}
\usetikzlibrary{er,positioning}
\definecolor{mypurple}{RGB}{117,112,179}
\input{version}

\newcommand{\DP}{Douglas \& Peucker}
\newcommand{\VW}{Visvalingam--Whyatt}
\newcommand{\WM}{Wang--M{\"u}ller}

\title{
    Cartographic Generalization of Lines using free software \\
    (example of rivers) \\ \vspace{4mm}
}

\iffalse
https://bost.ocks.org/mike/simplify/
http://bl.ocks.org/msbarry/9152218

small scale: 1:XXXXXX
large scale: 1:XXX

a4: 210x297mm
a5: 148x210mm
a6: 105x148mm
a7: 74x105mm
a8: 52x74mm

Crossing:
Xmin: 623306
Ymin: 6109635
Xmax: 625526
Ymax: 6111210
623306 6109635 625526 6111210
Crossing wxh: 2220, 1575 (m)


connect rivers first to a single polylines:
- some algs can preserve connectivity, some not.

ideal hypothesis: mueller algorithm + topology may fully realize cartographic generalization tasks.

what scales and what distances?

= Intro: Relevance
No suitable cartographic generalization algorithm is implemented in FOSS (2--3 sentences). So that cartographers
have a tool for river generalization.

Baseline: we take what cartographers currently have in their tool palette.

Take small river bends. Take short segments and compare them with the original.
Because there are no loops.

Zeimena extents: [606922,6097557,627230,6126362]
20308 x 28805 (w x h)

\fi

\author{Motiejus Jakštys}

\date{
    \vspace{10mm}
    Version: \VCDescribe \\ \vspace{4mm}
    Generated At: \GeneratedAt
}

\begin{document}
\maketitle

\begin{abstract}
\label{sec:abstract}
Current open-source line generalization solutions have their roots in
    mathematics and geometry, and thus produce poor cartographic output.
    Therefore, if one uses open-source technology to generalize cartographic
    objects, their down-scaled counterparts will be incorrectly
    scale-adjusted. This paper explores the available down-scaling
    implementations, highlights some of their deficiencies, and suggests a
    viable algorithm for an avid GIS developer to implement. Once the new
    algorithm becomes usable from within open-source GIS software (e.g. QGIS
    or PostGIS), small-scale maps created with free software will have a
    chance to be of higher quality.
\end{abstract}

\newpage

\tableofcontents
\listoffigures

\newpage

\section{Introduction}
\label{sec:introduction}

Several line generalization algorithms have been proposed that claim to handle
cartographic objects, such as lines, better. These fall into two rough
categories:
\begin{itemize}
    \item Cartographic knowledge encoded into an algorithm (bottom-up
        approach). One of these is \cite{wang1998line}.
    \item Mathematical shape transformations that yield a more
        cartographically suitable down-scaling, e.g. \cite{jiang2003line},
        \cite{dyken2009simultaneous}, \cite{mustafa2006dynamic},
        \cite{nollenburg2008morphing}.
\end{itemize}

During research for the mentioned articles, prototype code was written for
most of the algorithms. However, none of them seems to be available for use,
except for the two ``classical'' ones -- {\DP} and {\VW}.

\cite{wang1998line} present an algorithm created specifically for cartographic
generalization. It is in general use, but currently available only in a
commercial product. This poses a problem for map creation in open-source
software: there is no comparable high-quality simplification algorithm for
creating down-scaled maps, so any cartographic work that uses line
generalization as part of its processing will be of sub-par quality. We believe
that the availability of high-quality open-source tools is an important
foundation for future cartographic experimentation and development, and thus
benefits the cartographic community as a whole.

This paper reviews and compares two widely available algorithms that are often
used for line generalization:
\begin{itemize}
    \item \cite{douglas1973algorithms} via
        \href{https://postgis.net/docs/ST_Simplify.html}{PostGIS Simplify}.

    \item \cite{visvalingam1993line} via
        \href{https://postgis.net/docs/ST_SimplifyVW.html}{PostGIS SimplifyVW}.

\end{itemize}

Since both algorithms produce jagged output lines, it is worthwhile to process
their output with the widely available \cite{chaikin1974algorithm} smoothing
algorithm via \href{https://postgis.net/docs/ST_ChaikinSmoothing.html}{PostGIS
ChaikinSmoothing}.

The review of the available algorithms will be followed by desiderata for a
possible open-source addition. In the end, we will recommend an algorithm that
can be picked up and implemented by an avid GIS developer.

\section{Visual comparison}

Lakaja and a large part of Žeimena (see figure~\ref{fig:zeimena} on
page~\pageref{fig:zeimena}) will be used as inputs to the generalization
algorithms, because together they exhibit both straight and curved shapes, are
a combination of two curly rivers, and the author is familiar with the
location.

Since the map area is large (approx. 20\,km by 28\,km, scale $1:300\,000$), we
will also review a subset of the area of approx. 2200\,m by 1575\,m. The
zoomed-in version will help explain some of the deficiencies in the reviewed
algorithms.

\begin{figure}[H]
    \centering
    \includegraphics[width=67.5mm]{zeimena}
    \caption{Lakaja and Žeimena, with marked river crossing area, $1:300\,000$}
    \label{fig:zeimena}
\end{figure}

\begin{figure}[h]
    \centering
    \includegraphics[width=74mm]{crossing}
    \caption{River crossing area zoomed in, $1:30\,000$}
    \label{fig:crossing}
\end{figure}

\section{Comparison algorithms and parameters}
\label{sec:algs-and-params}

To visually evaluate the Žeimena sample, examples for {\DP} and {\VW}
were created using the following parameters:

\begin{enumerate}[label=(\Roman*)]
    \item {\DP} tolerance: $tolerance := 128 \cdot 2^n, n = 0,1,\ldots,5$.
    \item {\VW} tolerance: $vwtolerance := tolerance^2$\label{itm:2}.
\end{enumerate}

The squared tolerance, i.e. parameter~\ref{itm:2}, requires explanation.
Tolerance for {\DP} is specified in linear units, in this case meters.
Tolerance for {\VW} is specified in area units, $m^2$. As the author was not
able to locate a formal comparison between the two (i.e. how to calculate one
tolerance value from the other so that the results are comparable), the {\DP}
tolerance was arbitrarily squared and fed to {\VW}. To the author's eye, this
provides comparable and reasonable results, though the relationship deserves
further research.
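
The difference in units can be illustrated with a minimal Python sketch. It is
not the PostGIS implementation: the function names and the sample vertex are
made up for illustration, and the base value of 128 is taken from the row
labels of the comparison tables. For a single vertex between two neighbours,
{\DP} looks at a distance (meters), whereas {\VW} looks at the area of a
triangle ($m^2$):

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from vertex p to the line through a and b (linear units)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        # Degenerate case: a and b coincide.
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def triangle_area(a, p, b):
    """Area of the triangle a-p-b (area units), the quantity VW compares."""
    (ax, ay), (px, py), (bx, by) = a, p, b
    return abs((px - ax) * (by - ay) - (bx - ax) * (py - ay)) / 2


def tolerance_pairs(base=128, steps=6):
    """(DP tolerance, VW tolerance) pairs; VW is the DP value squared."""
    return [(base * 2**n, (base * 2**n) ** 2) for n in range(steps)]


# Hypothetical vertex p between neighbours a and b, coordinates in meters.
a, p, b = (0, 0), (50, 100), (100, 0)
distance = perpendicular_distance(p, a, b)
area = triangle_area(a, p, b)
```

For this sample vertex the distance is $100$\,m and the area is $5000\,m^2$;
both are below the smallest tolerance pair $(128, 16384)$, so in this
simplified view either criterion would drop the vertex.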

Chaikin's smoothing algorithm was applied with $nIterations = 5$. That number
was chosen for better visual appeal at the expense of computational cost. A
smaller number of iterations would retain visible angles, whereas a larger
number of iterations, like 5 (PostGIS supports values from 1 to 5), makes the
resulting lines very smooth.
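
Chaikin's corner cutting can be sketched in a few lines of Python. The sketch
assumes the textbook form of the algorithm (each segment is replaced by the
points at 1/4 and 3/4 of its length) and keeps the end points of an open line
fixed. It is an illustration only and may differ in details from
{\tt ST\_ChaikinSmoothing}; the function name is hypothetical.

```python
def chaikin(points, iterations=1):
    """Smooth an open polyline by repeated corner cutting.

    Assumption: end points are kept so the line does not shrink.
    """
    points = list(points)
    for _ in range(iterations):
        if len(points) < 3:
            break  # nothing to smooth on a single segment
        smoothed = [points[0]]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            # Replace each segment with its 1/4 and 3/4 points.
            smoothed.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            smoothed.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        smoothed.append(points[-1])
        points = smoothed
    return points


# A right-angle corner: one iteration cuts it into a shorter diagonal.
corner = [(0, 0), (4, 0), (4, 4)]
once = chaikin(corner)
five_times = chaikin(corner, iterations=5)
```

In this sketch every iteration doubles the number of vertices (3 become 6
after one iteration and 96 after five), which is where the additional
computational cost of a high iteration count comes from.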

As can be observed in table~\ref{tab:comparison-zeimena} on
page~\pageref{tab:comparison-zeimena}, both simplification algorithms convert
bends to chopped lines. This is especially visible at tolerances 256 and 512.
In a more robust simplification algorithm, the larger the tolerance, the larger
the bends of the original map that should be retained.

\begin{figure}[H]
    \renewcommand{\tabularxcolumn}[1]{>{\center\small}m{#1}}
    \begin{tabularx}{\textwidth}{ p{2.1cm} | X | X | }
        Tolerance DP/VW                                                   &
        {\DP}                                                             &
        {\VW}                                                             \tabularnewline \hline

        128/16384                                                         &
        \includegraphics[width=\linewidth]{zeimena_douglas_128}           &
        \includegraphics[width=\linewidth]{zeimena_visvalingam_128}       \tabularnewline \hline

        256/65536                                                         &
        \includegraphics[width=.5\linewidth]{zeimena_douglas_256}         &
        \includegraphics[width=.5\linewidth]{zeimena_visvalingam_256}     \tabularnewline \hline

        512/262144                                                        &
        \includegraphics[width=.25\linewidth]{zeimena_douglas_512}        &
        \includegraphics[width=.25\linewidth]{zeimena_visvalingam_512}    \tabularnewline \hline

        1024/1048576                                                      &
        \includegraphics[width=.125\linewidth]{zeimena_douglas_1024}      &
        \includegraphics[width=.125\linewidth]{zeimena_visvalingam_1024}  \tabularnewline \hline

        2048/4194304                                                      &
        \includegraphics[width=.0625\linewidth]{zeimena_douglas_2048}     &
        \includegraphics[width=.0625\linewidth]{zeimena_visvalingam_2048} \tabularnewline \hline

        4096/16777216                                                     &
        \includegraphics[width=.0625\linewidth]{zeimena_douglas_4096}     &
        \includegraphics[width=.0625\linewidth]{zeimena_visvalingam_4096} \tabularnewline \hline
    \end{tabularx}
    \caption{{\DP} and {\VW} on Žeimena}
    \label{tab:comparison-zeimena}
\end{figure}

To ease the discussion of shapes in the resulting output, it is useful to
define what a ``blunt bend'' is: a river bend that looks like a cutout from a
large circle, as illustrated in figure~\ref{fig:blunt-bent}.

\begin{figure}[h]
    \centering
    \begin{tikzpicture}
        \draw[color=mypurple] (-5,0) -- (-3, 0) ;
        \draw[color=mypurple] (0,0) arc (60:120:3) ;
        \draw[color=mypurple] (0,0) -- (2, 0) ;
    \end{tikzpicture}
    \caption{Blunt bend}
    \label{fig:blunt-bent}
\end{figure}

Once zoomed in to the river crossing area with {\DP} and {\VW} applied, it
becomes apparent that both large blunt bends are reduced to single straight
lines, and that the shape becomes jagged and unpleasant to the eye. See
table~\ref{tab:comparison-crossing} on page~\pageref{tab:comparison-crossing}.

\begin{figure}[h]
    \renewcommand{\tabularxcolumn}[1]{>{\center\small}m{#1}}
    \begin{tabularx}{\textwidth}{ p{2.1cm} | X | X | }
        Tolerance DP/VW                                                      &
        {\DP}                                                                &
        {\VW}                                                                \tabularnewline \hline

        128/16384                                                            &
        \includegraphics[width=\linewidth]{overlaid_zeimena_douglas_128}     &
        \includegraphics[width=\linewidth]{overlaid_zeimena_visvalingam_128} \tabularnewline \hline

        256/65536                                                            &
        \includegraphics[width=\linewidth]{overlaid_zeimena_douglas_256}     &
        \includegraphics[width=\linewidth]{overlaid_zeimena_visvalingam_256} \tabularnewline \hline

        512/262144                                                            &
        \includegraphics[width=\linewidth]{overlaid_zeimena_douglas_512}      &
        \includegraphics[width=\linewidth]{overlaid_zeimena_visvalingam_512}  \tabularnewline \hline

    \end{tabularx}
    \caption{{\DP} and {\VW} on river crossing area}
    \label{tab:comparison-crossing}
\end{figure}

As the reader may observe, the output lines, especially with higher tolerances,
are jagged. Higher-tolerance jagged outputs from {\VW} and {\DP}, passed
through Chaikin with 5 iterations, are displayed in
table~\ref{tab:chaikin-crossing} on page~\pageref{tab:chaikin-crossing}.

\begin{figure}[h]
    \renewcommand{\tabularxcolumn}[1]{>{\center\small}m{#1}}
    \begin{tabularx}{\textwidth}{ p{2.1cm} | X | X | }
        Tolerance DP/VW                                                              &
        {\DP} + Chaikin(5)                                                           &
        {\VW} + Chaikin(5)                                                           \tabularnewline \hline

        128/16384                                                                    &
        \includegraphics[width=\linewidth]{overlaid_chaikin_zeimena_douglas_128}     &
        \includegraphics[width=\linewidth]{overlaid_chaikin_zeimena_visvalingam_128} \tabularnewline \hline

        256/65536                                                                    &
        \includegraphics[width=\linewidth]{overlaid_chaikin_zeimena_douglas_256}     &
        \includegraphics[width=\linewidth]{overlaid_chaikin_zeimena_visvalingam_256} \tabularnewline \hline

        512/262144                                                                   &
        \includegraphics[width=\linewidth]{overlaid_chaikin_zeimena_douglas_512}     &
        \includegraphics[width=\linewidth]{overlaid_chaikin_zeimena_visvalingam_512} \tabularnewline \hline

    \end{tabularx}
    \caption{Chaikin-smoothed {\DP} and {\VW} on river crossing area}
    \label{tab:chaikin-crossing}
\end{figure}

There is another issue on the wishlist beyond jaggedness and loss of large
bends: combining close bends into larger ones.

\subsection{Combining bends}

Imagine there are two small bends close to each other, similar to
figure~\ref{fig:sinewave2} on page~\pageref{fig:sinewave2}, and one needs to
generalize them. The bends are too large to ignore and replace with a straight
line, but too small to retain both with their full complexity.

According to cartographic generalization rules
(\cite{miuller1995generalization}), consecutive small bends should be combined
into larger bends. {\WM} encoded this process into an algorithm.
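
Before bends can be combined, they have to be found. The Python sketch below
covers only that first step and rests on an assumption: a bend is taken to be
a run of consecutive vertices that turn in the same direction. It does not
implement the combination itself and is not the {\WM} algorithm; the function
names and sample lines are hypothetical.

```python
from itertools import groupby


def turn_signs(points):
    """Return -1 (right turn), 0 (straight) or +1 (left turn) per interior vertex."""
    signs = []
    for (ax, ay), (bx, by), (cx, cy) in zip(points, points[1:], points[2:]):
        # Sign of the cross product of the incoming and outgoing segments.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        signs.append((cross > 0) - (cross < 0))
    return signs


def count_bends(points):
    """Count runs of consecutive turns in the same direction.

    Assumption: each such run is one bend; straight vertices are skipped.
    """
    turning = [sign for sign in turn_signs(points) if sign != 0]
    return sum(1 for _ in groupby(turning))


# Made-up lines: an S-shaped double bend and a single arch.
double_bend = [(0, 0), (10, 8), (20, 0), (30, -8), (40, 0)]
single_bend = [(0, 0), (10, 8), (20, 10), (30, 8), (40, 0)]
```

Under this assumption {\tt count\_bends} reports two bends for the S-shaped
line and one for the arch, which is the distinction a bend-combining step
would need to make.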

\begin{figure}[h]
    \centering
    \includegraphics[width=52mm]{sinewave2}
    \caption{Example river bend that should be generalized}
    \label{fig:sinewave2}
\end{figure}

When one applies {\DP} to figure~\ref{fig:sinewave2}, either both bends remain
or they become a straight line; see table~\ref{tab:comparison-sinewave2} on
page~\pageref{tab:comparison-sinewave2}.
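
This all-or-nothing behaviour can be reproduced with a minimal Python sketch
of the textbook recursive form of {\DP}. The five-vertex double bend is made
up for illustration (it is not the data behind figure~\ref{fig:sinewave2}),
the helper names are hypothetical, and the code is not the PostGIS
implementation.

```python
import math


def distance_to_line(p, a, b):
    """Distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def douglas_peucker(points, tolerance):
    """Simplify a polyline; tolerance is assumed to be non-negative."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the vertex farthest from the line between the end points.
    index, max_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        d = distance_to_line(points[i], first, last)
        if d > max_distance:
            index, max_distance = i, d
    if max_distance <= tolerance:
        return [first, last]  # everything in between is dropped
    # Otherwise keep the farthest vertex and recurse on both halves.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


double_bend = [(0, 0), (10, 8), (20, 0), (30, -8), (40, 0)]
both_bends = douglas_peucker(double_bend, 2)
straight = douglas_peucker(double_bend, 16)
```

With a tolerance of 2 both peaks survive; with a tolerance of 16 only the end
points remain. Since the output is always a subset of the input vertices, no
tolerance value turns this input into a single combined bend.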

\begin{figure}[h]
    \renewcommand{\tabularxcolumn}[1]{>{\center\small}m{#1}}
    \begin{tabularx}{\textwidth}{ p{1.5cm} | X | X | }
        Tolerance DP/VW                                                       &
        {\DP}                                                                 &
        {\VW}                                                                 \tabularnewline \hline

        2/4                                                                   &
        \includegraphics[width=\linewidth]{overlaid_sinewave2_douglas_2}      &
        \includegraphics[width=\linewidth]{overlaid_sinewave2_visvalingam_2}  \tabularnewline \hline

        16/256                                                                &
        \includegraphics[width=\linewidth]{overlaid_sinewave2_douglas_16}     &
        \includegraphics[width=\linewidth]{overlaid_sinewave2_visvalingam_16} \tabularnewline \hline

        32/1024                                                               &
        \includegraphics[width=\linewidth]{overlaid_sinewave2_douglas_32}     &
        \includegraphics[width=\linewidth]{overlaid_sinewave2_visvalingam_32} \tabularnewline \hline

    \end{tabularx}
    \caption{{\DP} and {\VW} on example wave}
    \label{tab:comparison-sinewave2}
\end{figure}

Ideally, the double bend in figure~\ref{fig:sinewave2} should be normalized to
a larger single bend, similar to figure~\ref{fig:sinewave1} on
page~\pageref{fig:sinewave1}.

\begin{figure}[h]
    \centering
    \includegraphics[width=52mm]{sinewave1}
    \caption{Desired river bend generalization}
    \label{fig:sinewave1}
\end{figure}

To recap, both {\VW} and {\DP} simplify the lines, but their cartographic
output, when zoomed in, looks poor to the human eye. Can a better solution be
found?

\section{Recommendation}
\label{sec:recommendation}

So far, we have reviewed two widely available open-source generalization
algorithms, {\DP} and {\VW}, and can now enumerate their shortcomings:
\begin{itemize}
    \item The resulting generalized lines look jagged and, when zoomed in,
        unpleasant to the eye.
    \item Blunt bends are generalized to straight lines, even though sometimes
        they should remain blunt bends (or even become exaggerated bends).
    \item Consecutive small bends are not combined into a larger bend, even
        though they should be.
\end{itemize}

According to \cite{wang1998line}, their algorithm addresses all three issues
above. The algorithm is relatively simple to understand for a software
developer who is not an expert cartographer, and thus should be feasible to
implement in a few weeks.

\section{Conclusions}
\label{sec:conclusions}

We have evaluated two readily available line simplification algorithms, {\VW}
and {\DP}, using a river sample and a synthetic bend. Looking at the examples,
it is quite easy to see the most glaring deficiencies of the two when they are
applied to cartographic generalization.

We suggest complementing the list of algorithms available in open-source
software with {\WM}, which was created for cartographic generalization and
should fix the shortcomings identified in this paper.

\section{Related Work and future suggestions}
\label{sec:related_work}

\cite{stanislawski2012automated} studied different types of metric assessments,
such as Hausdorff distance, segment length, vector shift, surface displacement,
and tortuosity, for the generalization of linear geographic elements. This
research can serve as a reference for choosing appropriate line generalization
parameters for maps at various scales.
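
As an illustration of one such metric, the Python sketch below estimates the
Hausdorff distance between an original and a simplified line. It is a
simplification under stated assumptions: distances are sampled only at the
vertices of each line (measured to the segments of the other line), so the
result is an estimate rather than the exact value. The function names and the
sample lines are hypothetical and are not taken from the cited study.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def directed_distance(source, target):
    """Largest distance from a vertex of source to the polyline target."""
    segments = list(zip(target, target[1:]))
    return max(
        min(point_segment_distance(p, a, b) for a, b in segments)
        for p in source
    )


def hausdorff_estimate(line_a, line_b):
    """Vertex-sampled estimate of the Hausdorff distance between two polylines."""
    return max(directed_distance(line_a, line_b),
               directed_distance(line_b, line_a))


# Made-up double bend and its simplification to a straight line.
original = [(0, 0), (10, 8), (20, 0), (30, -8), (40, 0)]
simplified = [(0, 0), (40, 0)]
shift = hausdorff_estimate(original, simplified)
```

For the sample lines the estimate equals the height of the removed bends (8
units), which gives a single number for how far a simplification has moved
away from its source.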

As noted in parameter~\ref{itm:2} on page~\pageref{itm:2}, it would be useful
to have a formula mapping {\DP} tolerance to {\VW}. That way, visual
comparisons between line simplification algorithms could be more objective.

\printbibliography

\begin{appendices}

\section{Žeimena and Lakaja in context}

\begin{figure}[H]
    \centering
    \includegraphics[width=148mm]{zeimena-pretty}
    \caption{Lakaja and Žeimena river in context}
\end{figure}

\section{Code listings}

For curious readers, it may be useful to see how the analysis was executed.
Also, given the source listings, it should be relatively straightforward to
re-run the same analysis on a different area.

The input files outside of these listings are {\tt zeimena.gpkg}, which is a
manually created GeoPackage containing Žeimena and Lakaja rivers, and the
\LaTeX\ report itself.

The analysis was executed and the report was generated on Ubuntu 20.04 using
only system packages. The following should be sufficient: {\tt postgis gdal-bin
biber latexmk texlive-bibtex-extra python3-geopandas python3-pygments}.

\subsection{douglas.sql}
Simplifies a layer ({\tt :src}) with {\DP} using tolerance $tolerance$ and
writes the result into the {\tt :tbl} table.
\inputminted[fontsize=\small]{sql}{douglas.sql}

\subsection{visvalingam.sql}
Simplifies a layer ({\tt :src}) with {\VW} using tolerance $tolerance^2$ and
writes the result into the {\tt :tbl} table.
\inputminted[fontsize=\small]{sql}{visvalingam.sql}

\subsection{chaikin.sql}
Smooths a layer ({\tt :src}) with Chaikin's algorithm using $nIterations =
    5$ and writes the result into the {\tt :tbl} table. The parameters are
    explained in section~\ref{sec:algs-and-params} on
    page~\pageref{sec:algs-and-params}.

\inputminted[fontsize=\small]{sql}{chaikin.sql}

\subsection{fig2layer.py}
Creates figures (square, sine wave) as GeoPackage files.
\inputminted[fontsize=\small]{python}{fig2layer.py}

\subsection{Makefile}
This file binds all the pieces together:
\begin{itemize}
    \item Prepares the PostGIS database.
    \item Generates helper figures (sine waves, squares).
    \item Runs analysis on input files ({\DP} and {\VW}).
    \item Invokes {\tt latexmk} as a final report generation step.
\end{itemize}
\inputminted[fontsize=\small]{make}{Makefile}

\subsection{layer2img.py}
This file accepts a layer (or two) and generates a PDF image suitable for embedding into the report.
\inputminted[fontsize=\small]{python}{layer2img.py}

\subsection{managedb}
Manages a PostGIS database in the project directory. That way, the database can
be torn down and re-created by automated tools like the {\tt Makefile} itself.
You may need to update the paths in this script to suit your environment.
\inputminted[fontsize=\small]{bash}{managedb}

\end{appendices}
\end{document}