alisons⭒computer

Trail of Prayers

2024 · Flask, UMAP, NumPy, GloVe, scikit-learn

website, github repo

Trail of Prayers is an animated interface that generates, displays, and visualizes text, combining back-end and front-end components into an interactive experience. Using Flask, the back end generates prayer sentences through a combination of natural language processing and machine learning techniques, while the front end uses JavaScript to create two dynamic visual elements: a constellation of stars, one per sentence, laid out with Uniform Manifold Approximation and Projection (UMAP), and a trail of sentences in continuous sequence, each linked to its most semantically similar line in a corpus of more than 2,000 lines from 8 female mystics. Through a series of interconnected scripts, each specializing in a specific function such as sentence generation, data processing, or visual representation, Trail of Prayers offers a poetic and artistic exploration of text analysis and visualization.

Project backend

server.py

[ project stack flow ]

Server.py implements a basic web application using Flask, designed to generate and display prayer sentences. It begins by importing the necessary modules, including Flask for creating the web application, ‘multiprocessing’ for handling separate processes, and the local Python script ‘prayers.py’, which generates each sentence to visualize. The Flask application is initialized, and a multiprocessing queue is created to carry sentences from the ‘prayers.py’ function that generates them to the main Flask process.

Two routes are defined: the root route (‘/’), which renders the ‘index.html’ template as the main page, and the ‘/get_text’ route, which routinely checks the queue for generated elements. If an element exists, it retrieves it and returns it as a JSON response; if the queue is empty, it responds with None. This AJAX-enabled endpoint allows the front end JavaScript to fetch sentences dynamically without reloading the page.

A separate process is initiated to run the ‘generate_sentences’ function from the ‘prayers.py’ script, passing necessary arguments. This enables concurrent execution, allowing sentence generation without blocking the web server.

The script efficiently provides a framework for generating prayer sentences asynchronously while serving them to the front-end client through a JSON API. However, improvements can be made in error handling, graceful shutdown of background processes, scalability with more robust message-passing systems, and security measures to safeguard user inputs. Overall, the script establishes a responsive web application that effectively integrates background processing and dynamic content delivery.
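The queue handoff described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's code: ‘get_text_payload’, ‘start_worker’, and the "text" key are hypothetical names, and ‘generate_sentences’ here is a stand-in that simply forwards a list of lines. In a Flask route, the returned dictionary would typically be passed to ‘jsonify’.

```python
import multiprocessing as mp
import queue


def generate_sentences(q, sentences):
    """Stand-in for the generator in prayers.py: push each line onto the queue."""
    for sentence in sentences:
        q.put(sentence)


def get_text_payload(q):
    """Non-blocking poll, as a '/get_text' handler might do on each request.

    Returns the next queued sentence, or None when nothing is waiting.
    """
    try:
        return {"text": q.get_nowait()}
    except queue.Empty:
        return {"text": None}


def start_worker(sentences):
    """Run the generator in its own process so it never blocks the web server."""
    q = mp.Queue()
    worker = mp.Process(target=generate_sentences, args=(q, sentences), daemon=True)
    worker.start()
    return q, worker
```

Because the poll never blocks, a request that arrives before the next sentence is ready returns immediately with None, and the front end can simply try again on its next interval.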

reading_text.py

This Python script processes the text files containing literature from mystics, filtering and formatting the content for output into a CSV file. It begins by importing the necessary libraries, including ‘nltk’ for natural language processing, specifically to tokenize sentences, and ‘csv’ for writing data. The ‘main’ function encapsulates the script’s logic and includes a helper function, ‘filter_strings’, that filters out empty strings and strings consisting solely of punctuation. The script reads each text file, splits the content into paragraphs, and creates paired lines from alternating lines within those paragraphs, which are then cleaned of newlines and filtered using ‘filter_strings’. For each valid line, a dictionary is constructed to store the title of the writing, the author’s name, and the prayer line; it is appended to the ‘prayerData’ list, a comprehensive dataset of each sentence, its author, and its source. The script processes 8 authors, including Mirabai, Mirra Alfassa, and St. Teresa of Ávila, using a similar structure for each, refactored for better maintainability. After collecting all the data, it writes ‘prayerData’ into a CSV file using ‘csv.DictWriter’, ensuring a structured output.
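The filter-and-write steps might look roughly like the sketch below. It leaves out the ‘nltk’ tokenization and the line pairing, and the column names ("title", "author", "prayer"), the ‘build_rows’ and ‘write_csv’ helpers, and the sample text are all assumptions made for illustration.

```python
import csv
import io
import string


def filter_strings(lines):
    """Drop lines that are empty or made up only of punctuation and whitespace."""
    ignorable = set(string.punctuation + string.whitespace)
    return [line for line in lines if any(ch not in ignorable for ch in line)]


def build_rows(lines, title, author):
    """Clean newlines out of each line and attach its source metadata."""
    cleaned = [line.replace("\n", " ").strip() for line in lines]
    return [
        {"title": title, "author": author, "prayer": line}
        for line in filter_strings(cleaned)
    ]


def write_csv(rows, handle):
    """Write the rows with a header so each column is self-describing."""
    writer = csv.DictWriter(handle, fieldnames=["title", "author", "prayer"])
    writer.writeheader()
    writer.writerows(rows)


# Invented sample input; only the first entry should survive filtering.
rows = build_rows(["O Lord,\nhear me", "...", "", "  "], "Sample Title", "Sample Author")
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

Note that ‘string.punctuation’ covers ASCII punctuation only, so a line made of typographic quotes or dashes would pass this particular filter.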

prayers.py

[ prayer trail ]

Prayers.py is the primary script that constructs the sentence sequence, selecting a random line to begin it. It processes and analyzes a corpus of sentences using various natural language processing techniques and machine learning algorithms. It begins by importing necessary modules and libraries, including NLTK for text processing, NumPy for numerical operations, and data structures from the ‘datasketch’ library for locality-sensitive hashing (LSH). The script reads processed prayer data from a CSV file created by ‘reading_text.py’, extracting prayer lines and associated titles and authors into separate lists.

A custom function, ‘load_glove_embeddings’, is defined to load pre-trained Global Vectors for Word Representation (GloVe) word embeddings from a specified file, ‘glove.6B.300d.txt’, mapping each word to its vector representation. The script defines a function, ‘sentence_embedding’, that computes the embedding for an entire sentence by averaging the embeddings of its constituent words, ignoring words not found in the GloVe model. The ‘preprocess’ function tokenizes input sentences, filters out common English stop words as well as specific words related to religious texts (like “thy”, “thou”, “hast”, and “thee”), and returns a cleaned-up version of the sentence. To find the most similar sentence, the ‘find_most_similar’ function employs the LSH technique. It preprocesses the input sentence, computes its embedding, and uses MinHash to create a probabilistic signature of the sentence. The function then queries the LSH structure for candidate sentences and calculates the cosine similarity between each candidate and the input sentence to identify the most similar one.
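The averaging and cosine-similarity steps can be illustrated with a small, dependency-free sketch. It uses plain Python lists where the project would use NumPy arrays, a toy two-dimensional vocabulary in place of the 300-dimensional GloVe file, and it skips the MinHash/LSH candidate lookup entirely. Returning None for a sentence with no known words is an assumption, not necessarily what the original function does.

```python
import io
import math


def load_glove_embeddings(handle):
    """Parse GloVe's text format: one word per line, followed by its vector."""
    embeddings = {}
    for line in handle:
        parts = line.split()
        if len(parts) > 1:
            embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings


def sentence_embedding(words, embeddings):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy stand-in for the GloVe file; the words and values are invented.
toy_file = io.StringIO("light 1.0 0.0\nlove 0.0 1.0\nsoul 1.0 1.0\n")
glove = load_glove_embeddings(toy_file)
vec = sentence_embedding(["light", "love", "unknownword"], glove)
print(vec, cosine_similarity(vec, glove["soul"]))
```

Averaging discards word order, so two sentences built from the same words in a different sequence receive the same embedding; for short devotional lines this is usually an acceptable trade-off for speed.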

The script randomly selects an initial sentence from the list of prayer sentences, which serves as the starting point for generating a morphing sequence of sentences. It maintains a record of visited sentences to avoid repetition. The ‘generate_sentences’ function iteratively finds the most similar sentence to the current sentence and adds it to the morphing sequence until all sentences have been visited.
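The traversal amounts to a greedy nearest-neighbour walk over the corpus. The sketch below shows the idea with a brute-force search over every unvisited sentence instead of the LSH lookup the script uses; ‘generate_trail’ and the toy vectors are hypothetical, and the start index is fixed here for reproducibility where the script picks one at random.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def generate_trail(vectors, start):
    """Greedy walk: from each sentence, step to the most similar unvisited one."""
    order = [start]
    visited = {start}
    current = start
    while len(visited) < len(vectors):
        candidates = [i for i in range(len(vectors)) if i not in visited]
        current = max(
            candidates,
            key=lambda i: cosine_similarity(vectors[current], vectors[i]),
        )
        visited.add(current)
        order.append(current)
    return order


# Four invented 2D "sentence embeddings", indexed 0 to 3.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
trail = generate_trail(toy_vectors, start=0)
print(trail)
```

The brute-force version compares each step against every remaining sentence, which is quadratic overall; that is workable for a corpus of a couple of thousand lines, while an approximate index such as LSH trades some accuracy for speed on larger ones.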

Additionally, the script employs Uniform Manifold Approximation and Projection (UMAP) to reduce the dimensionality of the sentence embeddings, facilitating visualization. The ‘getPrayerCoords’ function computes the embeddings for the prayer sentences, applies UMAP to obtain two-dimensional coordinates, and then writes this data, including the original sentences and their corresponding titles and authors, to a JSON file for potential use in visualizations or further analysis. The script effectively integrates techniques for processing textual data, embedding sentences in a semantic space, and identifying relationships between sentences based on their content.

Project frontend

[ main page ]

index.html (main page)

The main page seamlessly integrates two distinct features to form a cohesive landscape. The upper section symbolizes the "sky," where a constellation of connected stars represents each sentence in the corpus. These stars are arranged using a UMAP configuration, with lines connecting the sentences in order. The lower section embodies the "land," where sentences create a flowing trail, surrounded by a field of flowers.

prayers.js

Prayers.js creates an interactive overlay that displays sentences sequentially, forming a trail at the bottom of the page surrounded by a field of associated visuals (flowers). It begins by selecting the necessary DOM elements and initializing several variables to manage state, including the current sentence index and the positioning direction of the sentences. The ‘applyZigZagPositioning’ function calculates the horizontal and vertical position of each newly added sentence element so that the trail follows a zigzag pattern, adjusting the position based on the current index and a specified direction. The ‘adjustFontSizeOnScroll’ function modifies the font size of each sentence during scrolling based on its distance from the top of the overlay: sentences further from the bottom appear smaller, enhancing the illusion of depth. The ‘addSentence’ function is responsible for adding new sentences to the overlay, updating the unique words tracked, and applying positioning and visual effects. It also handles mouse hover events that reveal the associated flower for each word in a sentence. The ‘updateUniqueWords’ function processes each sentence to extract words, creating and revealing corresponding flower elements while updating a set of unique words. Each flower element is styled and positioned relative to the sentence it represents.

The ‘positionFlower’ function determines the placement of flowers relative to the sentence's position, ensuring they are appropriately spaced. The size of flowers and their captions adjusts based on their position within the overlay during scroll events. The code includes event listeners for scrolling and resizing, ensuring that elements remain responsive to changes in the viewport. Additionally, a recurring AJAX call to retrieve new sentences and update the overlay occurs every four seconds. Overall, the code combines asynchronous data handling with dynamic DOM manipulation, ensuring a responsive user experience.

prayercoords.js

[ source index ]

Prayercoords.js visualizes prayer data as points in a 2D space, using asynchronous functions to fetch and manipulate the data. The ‘getCoordData’ function retrieves prayer coordinate data from a local JSON file, with error handling for fetch failures. The ‘getMinMaxRanges’ function iterates through the data to determine the minimum and maximum x and y values, which are used to scale the position of the coordinates. The ‘scaleData’ function rescales the coordinates to fit within the window dimensions, normalizing the data for visual representation. The ‘createPrayerPoints’ function generates ‘article’ HTML elements for each prayer point, applying CSS transforms to position them according to the scaled coordinates, while managing a dictionary (‘writerNameLists’) to group elements by writer name. The ‘connectPoints’ function creates lines between prayer points, calculating distances and angles to align the lines properly. The ‘slidingScale’ function introduces an interactive slider that sequentially highlights each sentence in the UMAP layout according to the dataset's original order.
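The scaling and line-placement arithmetic is independent of language, so it is sketched here in Python for consistency with the back-end examples rather than in the front end's JavaScript. The "x" and "y" keys, the snake_case function names, and the 800 by 400 viewport are assumptions chosen for illustration.

```python
import math


def get_min_max_ranges(points):
    """Return (min_x, max_x, min_y, max_y) over all points."""
    xs = [p["x"] for p in points]
    ys = [p["y"] for p in points]
    return min(xs), max(xs), min(ys), max(ys)


def scale_data(points, width, height):
    """Min-max rescale so the points span [0, width] by [0, height]."""
    min_x, max_x, min_y, max_y = get_min_max_ranges(points)
    span_x = (max_x - min_x) or 1.0  # guard against a zero-width range
    span_y = (max_y - min_y) or 1.0
    return [
        {
            "x": (p["x"] - min_x) / span_x * width,
            "y": (p["y"] - min_y) / span_y * height,
        }
        for p in points
    ]


def line_between(a, b):
    """Length and rotation in degrees of the segment from a to b."""
    dx, dy = b["x"] - a["x"], b["y"] - a["y"]
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))


# Invented UMAP-style coordinates scaled to a hypothetical 800x400 viewport.
raw = [{"x": -2.0, "y": 1.0}, {"x": 2.0, "y": 3.0}, {"x": 0.0, "y": 2.0}]
scaled = scale_data(raw, width=800, height=400)
length, angle = line_between(scaled[0], scaled[1])
print(scaled, length, angle)
```

A connecting line can then be drawn as an element of that length, anchored at the first point and rotated by that angle, which is the kind of distance-and-angle calculation ‘connectPoints’ is described as performing.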

Trail of Prayers represents a comprehensive exploration of computational geometry and visual aesthetics with an effective integration of natural language processing analysis. The significance of this work lies in its ability to seamlessly merge functionality with creativity. Through the use of Flask and advanced text processing techniques, the back-end dynamically generates content for visualization on the front end. The front-end enhances user interaction, enabling a visually rich experience that displays textual data in an insightful and interactive way. The attention to detail in ensuring non-overlapping elements, smooth animations, and a fluid user interface underlines the robustness of the code and its adaptability to various use cases. The project highlights the potential of combining precise calculations with artistic expression, showcasing the ability to transform a simple web page into an interactive and visually engaging experience.