Fred Hatfull
A Blog
A Blog
Apr 29th
The following post contains the notes for a talk I am going to give at Case Western Reserve University for MATH 408 – Introduction to Cryptography. An implementation for two of the first two algorithms can be found on my Github page, and I may implement the third one over the summer if time permits. The talk is based on a research paper written for the class, which can be downloaded here and explains the techniques discussed in this post in further detail.
With the rise of the digital age and the internet, it has never been more convenient to store, copy, and transmit large quantities of information. As digital information becomes more prevalent, so does being able to secure the communication of digital information. A number of cryptographic schemes are in use today that rely on computationally infeasible problems to ensure the security of the information transmitted or to verify the authenticity of a document or party. Steganography attempts to hide information, rather than securing it, by embedding the information in an otherwise innocuous host in such a way that distortion to the host is minimal and unable to be detected both perceptually and mathematically unless it is already known to exist. Watermarking is a type of steganography that attempts to mark a signal with information in a robust manner that can’t be removed or mangled if detected.
Cryptographic Schemes
Watermarking Schemes
Steganography – the practice of hiding messages in other information
cover audio – the signal into which hidden information is to be embedded
pseudo-noise (PN) sequence – a sequence of pseudo-randomly chosen 0s and 1s. The signal should exhibit properties similar to a random distribution and is therefore a type of noise.
A watermark ideally should not be able to be detected. If a watermark introduces too much distortion into the cover signal, it may be noticed by a human, a computer, or both. A watermarked signal should be perceptually and statistically indistinguishable from the cover signal to avoid suspicion.
A watermark should be present in a signal even after normal transformations have been applied to the signal. Some common transformations that can threaten a watermark include
Watermarks should also be robust against deliberate removal or corruption attacks, which attempt to make the watermark undetectable and unrecoverable from the signal.
A watermarking mechanism should provide capacity to store enough information to allow the watermark to be useful.
In order to examine the schemes presented here, it is important to understand how audio is represented in digital systems. Sound travels in waves, and digital representations of sound simply store how loud a wave is – its amplitude – as it changes over time. This is accomplished by storing the amplitude as a number, called a sample. Thousands of samples are stored every second; in fact, stereo “CD quality” audio is sampled at a rate of around 44.1 kHz, or 44,100 samples taken every second! The rate at which audio samples are taken is called the sampling rate.
Each sample is just a number, and every number can be represented as a series of bits. This is exactly how the computer sees and stores each sample, and can be manipulated directly to hide information. Each bit in the series of bits corresponds to a power of 2, depending on the place it is in. The bit corresponding to 0th power of 2, or 1, is called the least significant bit, because it’s value has the lowest impact on the overall number represented by the bitstring.

An example byte corresponding to the number 18. The rightmost bit, a 0, is the least significant bit in this representation.
Least significant bit hiding attempts to simply encode watermark information into the LSB of each sample in the host signal. Each sample’s LSB is changed to reflect the next bit in the watermark until there are no more watermark bits to encode. Variants on this algorithm exist and include performing an exclusive or (XOR) between the LSB and the watermark bit, as well as encoding the watermark repeatedly until the cover signal is exhausted.
LSB hiding is an elementary algorithm, and presents a couple advantages along with some significant weaknesses.
An example implementation of an LSB hiding scheme is provided for reference. The implementation is done using Python 2.7 and can be found on my Github page.
Echo hiding exploits the fact that the human auditory system is unable to detect very short echos (sub millisecond) in sound. By alternating sub-perceptible echos, binary information can be encoded into an audio signal.
The first step in echo hiding is to determine the times of the echos that will correspond to 0s and 1s, which we will call d0 and d1, respectively. These times should be chosen between 0.5 milliseconds, where it can be hard to detect the echo computationally, and 1 millisecond, where an acute listener could start perceiving a distortion in the signal.
Next, two derivatives of the cover signal are generated, one with the “zero” echo applied and one with the “one” echo applied. A mixer signal is also generated based on the binary representation of the watermark. The mixer signal dictates when the “zero” echo should be in effect and when the “one” echo should be in effect. This signal is applied to the cover with the zero echo and the cover with the one echo, having the effect of turning the one signal on when a one should be encoded and turning the zero signal on when a zero should be encoded. The two resulting signals are added together to produce the final watermarked signal.
Echo detection is accomplished using a computational technique called the autocepstrum. The autocepstrum is a combination of two signal processing techniques – the cepstrum, which attempts to separate the echos in a signal from the original signal, and autocorrelation, which attempts to find repeating patterns in a signal. Details about the autocepstrum are covered in the paper and are described elsewhere; for the sake of brevity, I will omit technical details about the computation here.
Computing the autocorrelation of a watermarked signal in chunks will allow an estimation of how long echos are in that chunk. If an echo is detected at d0, then it is known that a 0 was encoded. Likewise, if an echo is detected at d1, then it is known that a 1 was encoded. The string of 0s and 1s that are produced forms the recovered watermark.
An example implementation of an echo hiding scheme is provided in the Github repository for this talk. The implementation is not currently capable of completely decoding the audio deterministically due to a lack of time in preparing the presentation; however, it is provided to show that it is possible to recover an almost identical signal to the original watermark if one knows the echo times of the original encoding.
This technique, which I lovingly call SSHAM, exploits knowledge about the human auditory system to embed pseudo-noise (PN) sequences in audio. When sounds are made at a certain pitch, sounds made simultaneously at a similar pitch but lower in amplitude are masked by the human perceptual system. Likewise, as seen in echo hiding, sounds made soon after other louder impulses are also masked.
The signal is first broken down into segments and examined in the frequency domain using an operation called the Discrete Fourier Transform. This lets us examine the volume of each signal segment as it changes with frequency, rather than examining how the signal changes over time. Once the signal is in the frequency domain, we examine it and divide it into its tonal and non-tonal components.
Humans tend to classify sounds into two categories: tonal and non-tonal. Tonal sounds tend to have a narrow frequency band and have a distinct pitch to them; examples include a bird’s song, a musical instrument, or a human voice. Non-tonal sounds tend to have a wide frequency band and exhibit properties perceptually closer to noise. Examples of non-tonal sounds include the crunching of dry leaves, radio static, or the rustling of a plastic bag. Identifying the tonal and non-tonal components to a sound is important, as the human auditory system exhibits different sensitivities to tonal components than it does to non-tonal components.
Once identified, tonal components that lie close enough in frequency to other tonal components such that they are already masked are removed from the signal; likewise, components of the signal that lie below the absolute audible threshold are also removed. This leaves us with a signal which contains only the perceptually significant components in the cover audio. Using the tonal and non-tonal components of the signal, as well as empirical data about the human auditory system’s masking characteristics, a threshold describing the volume below which sounds are imperceptible from the cover signal is generated. The watermark, a PN-sequence, is then filtered with this threshold, resulting in a watermark signal in the frequency domain that will be imperceptible to human listeners.
The watermark is then shifted forward in time slightly to account for temporal masking and then added to the original cover signal.
A final step is taken to improve the robustness of the system against lossy encoding. Because lossy encoding schemes discard perceptually insignificant information in order to compress the signal, the watermark is encoded twice: once in the regions of the signal that are preserved by lossy encoders, and once again in the regions typically discarded by lossy encoders (for increased detection rates).
In order to detect the presence of the watermark in a signal, both the original cover signal and the watermark are needed. The original cover signal is subtracted from the received potentially watermarked signal. The result will either be noise resulting from encoding differences of the signals, or noise plus the watermark PN-sequence. Because PN-sequences autocorrelate very well, the result of the subtraction is autocorrelated with the watermark sequence. If the result crosses a reasonable threshold, the watermark is present in the signal.
While I was unable to implement this algorithm in time, the other two algorithms can be found at the Github repository. Additional technical details about the algorithms described here, as well as links to the original authors of the schemes, can be found in the paper, which is available for download here. This certainly is not a comprehensive survey of audio watermarking schemes. If you have a favorite that isn’t discussed here, feel free to post it in the comments!
Dec 8th
Inga Borga is a sweet, but lonely, old woman. These days she spends a lot of time reading poetry, not the least of which is authored by the esteemed poet Stanislav Slavinsky. Mr. Slavinsky is reading his poetry live on the nearby Planet Deux, a mere short hop by Space Train from Inga’s home. Inga wants nothing more to see Stanislav in person, so she embarks on the Mustachio Express to Planet Deux. Little does she know it will be a bumpy ride…
Space Train is now available for Mac and Windows! Download links are here:
Mac (55 MB)
Windows (51 MB)
Space Train is a point-and-click adventure game developed over the course of a semester by students at the Cleveland Institute of Art and Case Western Reserve University. It features a dynamic, event-driven level scripting engine that supports characters, items, inventory, dialogue, and more. Space Train is written in Python using the Pyglet graphics and audio library.
The Space Train team consists of
The project was supervised by Dr. Marc Buchner, Case Western Reserve University and Knut Hybinette, Cleveland Institute of Art.
Nov 18th
About a month ago, Case Western alumnus Nick Berendt graciously sent Toby and I to ErlangCamp. We didn’t know a thing about Erlang beyond having heard it mentioned among programming languages that never made it “main-stream.” After 48 hours of intense Erlang instruction, we emerged wiser men. If you don’t know anything about Erlang, here’s a quick rundown:
Erlang is
Toby and I gave an introductory talk on Erlang at CWRU Hacker Society last night. The slides can be found here.
Erlang: the Movie poster via Steve Vinoski.
Nov 6th
It’s that time of year again. Being a student means going through a fairly predictable job application process for summer internships every year, and this year has been no exception. This is only my second time through the procedure, but I’ve already begun to learn a thing or two about what is otherwise a fairly stressful and emotional process. What I’m about to explain may seem like common sense to the seasoned applicant, but is worth repeating because it is really the key to both finding the right job for you and landing that job. As I’m studying computer science, most of these tips are geared toward software engineering positions; however, most of the sentiments expressed here are applicable in many other fields, too.
This is especially true in computer science and related fields. Yes, you should know the fundamentals; you will never get hired if you don’t know the difference between a compiled and an interpreted language or what a binary tree is. However, what is (or should be) far more important to any potential employer is how you reason through problems as you tackle them. This is a skill that isn’t developed by reading books or blogs or attending lectures. Being able to effectively solve problems is a skill that is cultivated simply by problem solving. Most engineering disciplines, including software engineering, revolve around critical thinking and problem solving; being able to do both well is extremely important and should be sought after by any software shop worth their salt.
When you eventually get asked a question which seems virtually impossible to come up with an answer to, don’t panic. In fact, take this as a prime opportunity to show yourself off. Not only is the recruiter trying to test your limits, but he or she is also trying to gather very important information on what you do when asked to do something you may not necessarily know how to do. The best thing you can do in this situation is to say everything that goes through your head. A recruiter can’t possibly know what you’re thinking if you don’t say it. You will likely explore several possibilities that may or may not result in success and you will probably receive some coaching on the way. Finding the answer is worth bonus points, but what is more important is that the interviewer got a glimpse into your thought process.
It’s often easy to feel like the whole interview process is just a test which determines whether or not you are fit to work somewhere in particular. However, it’s important not to forget that while job interviews serve as a way for employers to find out more about an applicant, it’s also a very important means for an applicant to find out more about an employer. There’s a lot to be learned about a company during a job interview. You should be paying as much attention to the recruiter as he or she is paying to you.
Think about what kind of questions you are being asked. Often the inquiries made by a recruiter reflect a lot of information about the employer. As explained earlier, questions that are out of your reach or involve some sort of reasoning are important. If you find yourself consistently stimulated by questions asked in the interview, chances are good you will be consistently stimulated on the job, too. Likewise, if all of the questions in the interview are about keywords in a specific language (I’ve had this happen at least once!), there’s a reasonable chance you could find yourself performing similarly mindless work for that employer.
It’s important for you to be asking questions in the interview as well. When a recruiter asks “Do you have any questions?“, the answer is never “no.” This is good practice for two reasons. Firstly, it shows the employer that you have some interest in the company and that you care about where you end up working. But secondly, and perhaps more importantly, it gives you a chance to actually figure out what the company is about, and whether or not is actually somewhere you will want to be developing software. In a good interview you will probably have a long list of questions about the company, the technologies it uses, how it’s structured, and more. However, if you are unsure about the company, this is a great opportunity to find out whether or not it’s somewhere you want to end up. A good place to start is Joel Spolsky’s 12 Steps to Better Code. While that list is over 10 years old now, it provides some good starting points for probing how a software shop conducts itself.
Not every job is for everyone, and not everyone is for every job. The job application process can seem tedious, one-sided, and disheartening, especially in a climate where jobs are scarce. Every person has a unique set of talents and skills they bring to the table, and some positions are better fits than others for your particular set. It may seem trite, but you really do have to just keep trying. Eventually you will find a position that’s a good fit for both you and the employer. Until then, try to stay confident if you don’t get a callback from that killer job you were hoping for. Confidence is a huge asset in the job search process, and maintaining it is invaluable.
Oct 30th
I used to think blogging was lame.
Now I have lots of stimulating things going on and too much stuff to keep in my brain. I want to put some of it here.
Maybe it will help someone some day.
This is my blog.