Turkish Hyphenation in Common Lisp

Description of the problem

Unlike English, Turkish has a very regular orthography especially when it comes to hyphenation rules. Today, I am going to write a common lisp program that gives you the proper Turkish hyphenation of a Turkish word. Back in the day, I wrote the original in C for ISO8859-9 encoding. I should extend it for unicode, but dealing with unicode in C feels like cleaning hair from shower drain.

The code

First, I need a function which would tell us whether a given character is a vowel or a consonant:

(let ((vowels '(#\a #\â #\e #\ı #\i #\o #\ö #\u #\ü #\û)))
  (defun vowelp (ch)
    (member ch vowels)))

VOWELP

Next, the function that hypenates a given word:

(defun hyphen (raw)
  (let ((w (format nil "~a  " raw))
        res flag dash)
    (dotimes (i (length raw))
      (if (vowelp (elt w i))
          (setf flag nil
                dash (some #'vowelp (subseq w (1+ i) (+ i 3))))
          (when (not (or (setf flag (not flag))
                         (setf dash (not (vowelp (elt w (1+ i)))))))
            (push #\- res)))
      (push (elt w i) res)
      (when dash (push #\- res))
      (setf dash nil))
    (ppcre:split #\- (concatenate 'string (reverse res)))))

HYPHEN

And few tests:

(mapcar #'hyphen '("işkillendim" "ağrılarımsa" "erkekle" "tarımsal" "yap" "üre" "tank" "ionya" "çekoslavakyalılaştıramadıklarımızdanmışcasınaymışsa"))

((iş kil len dim) (ağ rı la rım sa) (er kek le) (ta rım sal) (yap) (ü re)
 (tank) (i on ya)
 (çe kos la vak ya lı laş tı ra ma dık la rı mız dan mış ca sı nay mış sa))

Older Posts

[2025-02-23] Counting Matroids

[2025-02-12] Sampling from a Random Variable

[2025-01-29] Markov Numbers

[2024-12-24] Number of isomorphism classes of simple graph (continued)

[2024-12-22] Counting Isomorphism Classes of Graphs

[2024-11-25] Connected Components of Graphs

[2024-11-24] Counting connected components of a graph

[2024-11-18] Counting Isomorphism Classes of m-ary Trees

[2024-11-16] Number of Isomorphism Classes of Ternary Trees

[2024-11-12] Hosoya Index of Balanced Binary Trees

[2024-11-11] Hosoya Index of a Graph

[2024-10-29] The Clique Number of a Simple Graph

[2024-10-28] The Size of Maximally Independent Subsets in a Graph

[2023-11-03] Graph Algorithms in JGraphT with Common Lisp

[2023-10-28] An Implementation of Pandas’ cut and qcut in Lisp

[2023-07-24] A Collatz-like Conjecture for the Projective Line

[2023-03-06] Twin Primes, Cousin Primes, Sexy Primes, and Prime Triplets

[2023-03-02] Set of All Partitions of a Finite Set

[2023-02-14] Non-crossing Partitions and Dyck Words

[2023-02-13] Non-crossing Linear Chords

[2023-02-04] Clojure/Python Interop Examples

[2023-01-14] Graph Algorithms in Clojure with JGraphT

[2022-03-29] 2D-Random Walk

[2022-03-28] Trade Deficit vs Exchange Rate Curve

[2022-03-16] Working with World Bank Data in Python

[2022-03-09] Working with European Central Bank data in python (revisited)

[2022-01-24] A Clique Analysis of Quakers in early modern Britain (1500-1700)

[2021-12-05] Boyer–Moore and Misra-Gries Algorithms in Clojure

[2021-09-12] Tension in Text Plotted

[2021-09-02] Statistical Distributions using Apache Commons Math in Clojure

[2021-08-31] Reduce with Intermediate Results in Common Lisp

[2021-08-21] Multivariate Regression Implemented in Clojure

[2021-05-29] Using Neural Networks to Detect Graph Properties

[2021-04-17] Fast Null-Space Calculation via LU-Decomposition

[2021-02-24] Stoer-Wagner Algorithm in Clojure

[2021-02-19] Calculating Vertex Covers in Clojure

[2021-02-18] Listing All Paths in a Graph

[2021-02-14] Strict Dyck Words and Fibonacci Numbers

[2021-02-14] Kruskal’s Algorithm in Common Lisp

[2021-02-13] Kruskal's Algorithm Implemented in Clojure

[2021-02-10] An integer dynamical system of integers

[2021-02-08] Binary Symmetrization

[2021-01-28] Prüfer Encoding and Decoding of a Tree in Clojure

[2021-01-27] Counting Cycle-Free Paths in a Graph

[2021-01-27] Counting Connected Labeled Graphs

[2020-12-18] Counting Graphs with a Prescribed Degree Sequence

[2020-12-13] Havel–Hakimi Algorithm in Clojure

[2020-12-12] Havel–Hakimi Algorithm in Common-Lisp

[2020-10-23] The Quadratic Casimir Element

[2020-07-04] Collatz Sequence in Binary

[2020-07-02] A Lazy Sequence of Primes in Clojure

[2020-06-10] Yet Another Fizz-Buzz in Common Lisp

[2020-05-12] ECB Data with Clojure and Vega-Lite

[2020-05-06] Processing ECB Data with Common Lisp

[2020-04-17] Next Permutation in the Lexicographical Ordering

[2020-04-13] Turkish Hyphenation in Common Lisp

[2020-04-01] Using JavaPlex with Clojure

[2019-11-05] Constricted Arithmetic Progressions

[2019-11-03] The Number of Arithmetic Progressions of Integers

[2019-05-06] Bron-Kerbosch Algorithm in Clojure

[2019-05-01] An Implementation of Ford-Fulkerson Algorithm in Clojure

[2019-04-22] Document Summarization via Nonnegative Matrix Factorization

[2019-04-20] Latent Semantic Analysis in Clojure

[2019-04-13] K-Nearest Neighbors Algorithm in Clojure

[2019-04-06] K-Means Implemented in Clojure

[2019-03-19] Prüfer Encoding/Decoding of a Tree in Common Lisp

[2019-03-05] Gale-Shaply Algorithm in Common Lisp

[2019-03-02] Calculating The Correct Rank of a Matrix

[2018-12-04] Feed-forward and back-propagation in neural networks as left- and right-fold

[2018-10-31] Nonnegative Matrix Decomposition in Clojure

[2018-10-30] Non-negative Matrix Decomposition in Scala

[2018-08-30] Working with European Central Bank Data in Scala

[2018-07-30] Perverse Sequences

[2018-05-28] Online Perceptron in Common Lisp

[2018-05-26] Online Perceptron

[2018-05-18] Online Regression

[2018-05-06] Knut’s Algorithm-S in Common Lisp

[2018-02-28] Irreducible Dyck Words

[2018-02-19] Optimization with GNU Scientific Library for Lisp

[2018-02-10] Van Eck’s Sequence

[2018-02-09] Hiring networks in mathematics

[2018-02-08] Linus Sequence

[2018-02-05] Egyptian Fractions

[2018-02-01] Listing all Young Tableaux

[2018-01-23] Collatz sequence (yet again)

[2018-01-15] Hofstadter's Q sequence

[2018-01-09] Farey Sequence

[2018-01-09] Catalan's Triangle

[2018-01-06] The Shoelace Formula for the Area of a Polygon

[2017-10-01] Working with European Central Bank Data in Python

[2017-09-27] Expected Value of the Diameter of a Tree

[2017-09-26] Using Quandl with kixi.stats on Clojure

[2017-09-22] Using Quandl with Common Lisp

[2017-08-05] Solving Linear Equations in Natural Numbers

[2017-07-31] Transitive Closure of a Directed Graph or a Relation

[2017-07-20] Steenrod-Milnor and Tournament Sequences

[2017-07-15] A lower bound on the radius of a graph

[2017-07-08] All partitions of an integer

[2017-07-06] Some Hasse Diagrams

[2017-07-04] Shuffles

[2017-07-03] Kaprekar Sequence

[2017-07-01] Lattice of Dyck Words

[2017-06-28] The poset of connected subgraphs of a connected graph

[2017-06-21] Calculating the Diameter and the Radius of a Graph Using Tropic Linear Algebra

[2017-06-19] Generating random regular graphs

[2017-06-14] Estimating the maximum element of a large list

[2017-06-09] A Stochastic Gradient Descent Implementation in Clojure

[2017-06-06] A topology problem

[2017-04-22] Listing duplicate files

[2017-03-14] My First Idris Proof

[2016-12-02] Distinguishing hash functions (part II)

[2016-12-01] Distinguishing hash functions

[2016-10-20] Hofstadter-Conway $10,000 sequence

[2016-08-18] A Solution for Problem 171 of 4Clojure

[2016-08-16] Puzzles and Group Theory

[2016-08-13] Using Weka within Lisp

[2016-07-12] Funniest and Unfunniest Jokes in the Jester Dataset

[2016-07-05] Generating Uniformly Random Connected Graphs

[2016-06-16] The Robinson-Schensted Algorithm

[2016-06-01] Conjugate Partitions

[2016-04-27] Using Word2Vec from Clojure

[2016-04-24] Using Word2Vec from Common Lisp

[2016-04-18] A Migration Analysis

[2016-04-11] Basic Data Analysis with CL without Frameworks

[2016-03-25] Parallel map-reduce in Common Lisp

[2016-02-22] Text Summarization and Topic Analysis

[2016-01-27] Set Covering Problem

[2016-01-25] Kolmogorov-Smirnov Test

[2016-01-20] Eigen-values of the Laplacian and Connected Components of a Graph

[2015-12-12] Dual Graphs

[2015-10-26] Longest Increasing Subsequence Revisited

[2015-10-16] Document Summarization via Markov Chains

[2015-10-07] Computational Literary Analysis

[2015-09-30] Library of Babel in Common Lisp

[2015-09-28] Merging Association Lists in Common Lisp

[2015-07-22] Cheapest Paths via Tropic Matrices

[2015-07-21] Hidden Markov Models via Tropic Matrices

[2015-07-08] A non-technical post

[2015-06-28] An implementation of the Viterbi algorithm in Common Lisp

[2015-05-28] Greatest Common Divisor of Two Rational Numbers

[2015-05-21] Partitions of Equal Measure Whatever the Measure May Be

[2015-05-14] Finding Cliques in a Graph

[2015-05-12] Set Cover Problem

[2015-05-03] Threading Macros in Common Lisp

[2015-05-03] Happy Numbers

[2015-05-01] Collatz Primes

[2015-04-23] Splitting Streams

[2015-04-06] Hamming Distance and Double Hashing

[2015-04-05] Hamming Distance and Hashing Functions

[2015-04-05] Hamming Derivative of Hashing Functions

[2015-04-02] A Topology Problem

[2015-03-21] Curve Fitting is a Gram-Schmidt Reduction

[2015-03-08] Maximum number of characters using keystrokes A, Ctrl+A, Ctrl+C and Ctrl+V

[2015-03-06] Eccentricity, Radius and Diameter in a Graph, Revisited

[2015-03-01] Graphs and Entropy

[2015-02-22] Math PhD Hiring Network (Part 3)

[2015-02-19] Math PhD Hiring Network (Part 2)

[2015-02-18] Math PhD Hiring Network (Part 1)

[2015-02-17] Faculty Networks and Inequality in Hiring Practices in Universities

[2015-02-10] Functional Streams in Lisp Explained

[2015-02-05] Collatz-type Conjectures (Continued)

[2015-02-04] Collatz-type Conjectures (Continued)

[2015-01-31] Collatz-type Conjectures (Continued)

[2015-01-30] Collatz-type Conjectures

[2015-01-28] Experiments with Infinite Recursive Sequences (continued)

[2015-01-17] Experiments with Infinite Recursive Sequences

[2015-01-10] Goldbach Pairs

[2015-01-02] Collatz Lengths (Continued)

[2015-01-01] Functional Streams

[2014-12-27] Polarization in the US Congress

[2014-12-18] Partition a sequence

[2014-11-28] Uniformly Random Permutations

[2014-11-22] An Implementation of Ford-Fulkerson Algorithm in Common Lisp

[2014-11-17] Tropic Calculation of Cheapest Paths

[2014-11-05] Longest common subsequence of two sequences

[2014-10-30] Counting Spanning Trees of a Graph

[2014-10-26] Longest Increasing Subsequence

[2014-10-24] The Number of Inversions in a Sequence

[2014-10-22] Hashes and Entropy

[2014-10-09] Estimating Cardinality with Constant Memory Complexity

[2014-09-30] Landau's Function

[2014-09-29] A Problem on Substitution Ciphers and Group Theory

[2014-09-28] A Morse Code Translator

[2014-09-23] A Memoization Macro for Common Lisp

[2014-09-21] Reducers are Monoid Morphisms

[2014-09-18] Number of isomorphism classes of binary trees

[2014-09-07] CONS is your friend

[2014-08-22] A Zipf's Law Simulation

[2014-08-07] Generating Uniformly Random Trees

[2014-07-09] A Solution for Project Euler #463

[2014-06-12] Entropy of truncated MD5 hashing

[2014-06-08] Hexadecimal digits of π

[2014-02-11] Information content of n-grams

[2014-02-08] Turkish Sentiment Analysis Using Thesaurus Distance

[2014-02-01] Sentiment analysis using word distances

[2014-01-27] Phase transitions in entropy

[2013-12-13] Optimal length of n-grams

[2013-12-10] Counting strings that contain intervals of same letter repetitions

[2013-12-02] Patterns Separating Large Texts

[2013-11-23] Collatz Sequences (Continued)

[2013-11-11] Entropy and approximately one-to-one maps

[2013-10-23] Tree Isomorphism

[2013-10-15] Self Organizing Maps

[2013-09-15] Euler Project #401

[2013-09-15] An additively recursive definition of the Moebius function

[2013-09-11] An Unsuccessful Attempt for Solving Euler Project #401

[2013-09-04] Uniform Sampling from Parametrized Submanifolds in Scala

[2013-09-04] Uniform Sampling from Parametrized Submanifolds

[2013-08-30] Randomly Generated Points Obeying a Distribution

[2013-08-25] Simulated Annealing in Lisp

[2013-08-21] Eigenvalues and Eigenvectors in GSLL

[2013-08-16] Reservoir Sampling

[2013-08-11] Gibbs sampling in lisp compared with C

[2013-08-10] Logistic Regression in lisp

[2013-08-10] Linear Discriminant Analysis in R

[2013-07-17] A Gradient Descent Implementation in Lisp

[2013-07-01] k-Nearest Neighbor Classification Algorithm Implemented in Lisp

[2013-05-19] Newton-Raphson Method

[2013-05-07] Levenshtein Distance

[2013-04-15] Cut points in a graph

[2013-04-01] Experiments on Collatz Lengths (Continued)

[2013-02-18] The sound of the torsion parts of homotopy groups of spheres

[2013-02-12] Monadic Units

[2013-02-07] Distribution of Collatz Lengths (continued)

[2013-02-03] Distribution of Collatz Lengths

[2013-01-31] Quotients of polynomial algebras

[2013-01-12] Path ideals

[2013-01-10] McCarthy91 Terminates

[2013-01-09] Finding all paths in a directed graph

[2013-01-04] A Simple Monte-Carlo Integration Implementation in Lisp

[2012-12-30] A simple problem in Kolmogorov-Chaitin complexity

[2012-12-29] From walks to paths

[2012-12-16] Higher order functions, functors and monads

[2012-12-13] Eccentricity, Radius and Diameter in an Undirected Graph

[2012-11-29] Untitled

[2012-11-25] Strictly Increasing Labels of Directed Graphs

[2012-11-19] Strictly Increasing Labellings of Directed Graphs

[2012-11-17] Nilpotent elements in an artinian algebra

[2012-11-04] Local rings, idempotents and non-invertible elements

[2012-10-18] An implementation of the fixed-radius near neighbor clustering algorithm in lisp

[2012-10-15] Reducing directed graphs

[2012-10-10] An implementation of the k-means clustering algorithm in lisp

[2012-10-08] A comparison of different map functions in lisp

[2012-10-03] Source code entropy

[2012-09-28] Collisions in random walks

[2012-09-26] Transitive closure of a directed graph

[2012-09-26] Solving linear equations in ℕ

[2012-09-26] Listing partitions

[2012-09-26] Inverting formal power series

[2012-09-26] Hasse subgraph of a directed graph