Three Algorithm Optimizations Outside [Place], [Other place]

  • The strings in the file are uppercase ASCII
  • We can keep a “stack” of characters we’ve seen, and the order in which they’ve been seen
  • If the first character in each string is somehow mapped to 0, and mappings for subsequent new characters that are “seen” increase by 1, patterns can trivially be compared to see whether they match.

Speeding Up Your Program, 101

“the means of labour passes through different metamorphoses, whose culmination is the machine”
[profile.release]
lto = true
[profile.bench]
lto = true

Opening a File and Getting Strings

fn file_to_lines<P>(filename: P) -> Vec<String>
where
P: AsRef<Path>,
{
let file = File::open(filename).expect("Couldn't open file");
let buf = BufReader::new(file);
buf.lines()
.map(|line| line.expect("Could not parse line"))
.collect()
}

Generating Patterns

The First Attempt

fn generate_pattern(hs: &str) -> Vec<u8> {
let mut stack = String::with_capacity(hs.len()).to_owned();
let mut pattern = Vec::with_capacity(hs.len());
for character in hs.chars() {
if let Some(needle) = stack.find(character) {
pattern.push(needle as u8)
} else {
stack.push_str(&character.to_string());
pattern.push((stack.len() - 1) as u8)
}
}
pattern
}

The Second Attempt

fn generate_pattern(haystack: &str) -> Vec<usize> {
let mut stack: Vec<&u8> = Vec::with_capacity(haystack.len());
let mut pattern = Vec::with_capacity(haystack.len());
for b in haystack.as_bytes() {
if let Some(n) = stack.iter().position(|&elem| elem == b) {
pattern.push(n)
} else {
stack.push(byte);
pattern.push(stack.len() - 1);
}
}
pattern
}

The Third Attempt

fn generate_pattern(haystack: &str) -> Vec<u8> {
let mut total = 0u8;
let mut stack = [0u8; 128];
let mut pattern = Vec::with_capacity(haystack.len());
for &byte in haystack.as_bytes() {
if byte as usize > 127 {
println!("Got a non-uppercase ASCII character!");
exit(1)
}
let mut needle = stack[byte as usize];
if needle == 0 {
total += 1;
stack[byte as usize] = total;
needle = total;
}
pattern.push(needle - 1)
}
pattern
}

Frequency Counting

pub fn count_frequency(patterns: &[Vec<u8>]) -> u32 {
let mut freq: HashMap<&[u8], u32> =
HashMap::with_capacity(patterns.len());
patterns
.iter()
.for_each(|pattern| *freq.entry(pattern).or_insert(0) += 1);
freq
.par_iter()
.filter(|&(_, &value)| value > 1)
.fold(|| 0, |accum, entry| accum + entry.1)
.sum()
}
  • filter its values, retaining only counts greater than 1
  • use a fold to accumulate the remaining values
  • sum the result of the fold, giving me the final count.

Results

Complexity

This image is huge
#!/usr/bin/env python
# coding: utf-8
from sys import exit
from collections import Counter

def generate_patterns(haystack):
""" Generate tuples of integers from ASCII uppercase strings """
total = 0
# we begin having seen no bytes
stack = [0] * 128
pattern = []
for char in haystack:
byte = ord(char)
if byte > 127:
print("Found a non-uppercase ASCII character!")
exit(1)
else:
needle = stack[byte]
if needle == 0:
total += 1
stack[byte] = total
needle = total
pattern.append(needle - 1)
# we need tuples because lists aren't hashable
return tuple(pattern)
if __name__ == "__main__":
with open("words.txt", 'r') as f:
cts = Counter((generate_patterns(line) for line in f))
friendly = sum(
{ptn: ct for ptn, ct in cts.items() if ct > 1}.values()
)
print("Number of friendly strings: %s" % friendly)
  • A context manager closes the file when we finish reading from it
  • We can iterate over one line at a time, generating its pattern
  • Generators mean we don’t have to worry about intermediate allocations
  • The built-in Collections library makes frequency-counting easy
  • Dict comprehensions make filtering on values easy.

--

--

--

I study cities and technology and people’s failure to solve problems in one using the other. I also make maps and, occasionally, programs.

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Git Basics for QA

Aloha, Quantum World!

How To Create A Website (Full Tutorial 2020)

How to setup CI(Continuous Integration) with ReactJs

Installing Windows 95 from Diskettes Lead Me to My Career

picoCTF write up: Nice netcat…

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Steph

Steph

I study cities and technology and people’s failure to solve problems in one using the other. I also make maps and, occasionally, programs.

More from Medium

Resources and “Bracket”

Rust Tales — step 0

A Beginners Guide to Learning Rust

Adding an executable target to a Rust library