Text steganography techniques are classified into three types:
1. Format-based methods, which modify the formatting of the cover-text to hide data without harming the cover-text. There are three methods of this type. The first adds extra whitespace to the text to hide data. The second is word shifting, which shifts the horizontal alignment of words in the cover-text by modifying the distances between them. The third is line shifting, which shifts the vertical alignment of lines in the cover-text to create a hidden shape that carries the data. Figure 1 below shows an example of each method.
2. Random and statistical generation methods, which generate the cover-text automatically according to the statistical characteristics of the language.
3. Linguistic methods, which exploit the linguistic properties of the cover-text and alter it to hide the data. These methods fall into two types. The first is the syntactic method, which places punctuation marks at appropriate positions in the cover-text to hide data. The second is the semantic method, which replaces words with their synonyms to hide data.
Researchers have proposed other methods to hide data in text, such as feature coding, which changes features of the text, and methods that use specified characters in words or modify the spelling of words.

Text Steganalysis

Text steganalysis aims to detect hidden data in the cover-text by taking advantage of the fact that text steganography changes the statistical properties of the cover-text in order to hide data. Text steganalysis techniques are classified into three categories, as follows:
1. Format-based techniques. There is little work in this area. Here we present two proposed techniques that rely on the format of the cover-text to detect hidden information. The first performs statistical analysis of word-shift text steganography using neighbor difference. The second uses the Support Vector Machine algorithm to detect whether hidden information exists and to estimate the length of the hidden information.
2. Invisible character-based techniques. Here we present two proposed techniques of this type for detecting hidden information. The first is based on randomness. In this technique, the information is hidden in a webpage and represented as two states, which can be expressed as a binary code string.
The randomness of the states differs depending on whether or not the webpage carries secret information. This technique uses statistical features to detect whether the webpage hides information; these features are obtained by capturing the randomness after transforming the binary code string into an octal string. Information can be hidden in the letters of tags in a webpage in ways that are invisible to the human eye in a browser.
The second technique builds higher-order statistical models using offset to detect whether or not information is hidden in the letters of tags in a webpage.
3. Linguistic techniques, which detect hidden information by using features that characterize the syntax and the semantics of the cover-text. Here we present one technique that uses meta features and an immune clone mechanism to detect hidden information in text. First, texts are represented using meta features. Then, appropriate features are selected by the immune clone mechanism to create effective detectors.
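
To make the relationship between hiding and detection concrete, the following minimal sketch illustrates the whitespace variant of the format-based methods described above, together with a naive format-based check. It is an illustration under stated assumptions, not any of the surveyed techniques: the encoding (one bit per line, a trailing space for 0 and a trailing tab for 1) and the function names embed_bits, extract_bits, and trailing_whitespace_ratio are hypothetical choices made for this example. The check does not implement the neighbor-difference or Support Vector Machine techniques mentioned above; it only shows that embedding changes a measurable property of the cover-text.

```python
def embed_bits(cover_text: str, bits: str) -> str:
    """Hide one bit per line as trailing whitespace (assumed encoding:
    '0' -> one space, '1' -> one tab)."""
    lines = cover_text.split("\n")
    if len(bits) > len(lines):
        raise ValueError("cover-text has too few lines for the payload")
    stego_lines = []
    for i, line in enumerate(lines):
        # Remove existing trailing whitespace so it cannot be read as payload.
        line = line.rstrip(" \t")
        if i < len(bits):
            line += " " if bits[i] == "0" else "\t"
        stego_lines.append(line)
    return "\n".join(stego_lines)


def extract_bits(stego_text: str) -> str:
    """Recover the bit string from the trailing whitespace of each line."""
    bits = []
    for line in stego_text.split("\n"):
        if line.endswith(" "):
            bits.append("0")
        elif line.endswith("\t"):
            bits.append("1")
    return "".join(bits)


def trailing_whitespace_ratio(text: str) -> float:
    """Naive format-based indicator: the fraction of lines that end in a
    space or tab. Ordinary text is expected to score close to zero."""
    lines = text.split("\n")
    flagged = sum(1 for line in lines if line != line.rstrip(" \t"))
    return flagged / len(lines)
```

For a four-line cover-text and the payload "101", embed_bits marks three of the four lines, so trailing_whitespace_ratio rises from 0.0 to 0.75 while the visible text stays the same. A practical detector would need a threshold or a trained classifier, since ordinary documents sometimes contain accidental trailing whitespace.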