A.N. Lantink
Binarization of Historical Watermarks
A Review of Thresholding Techniques Applied to Historical Watermark Images
A watermark image is a scan of a historical paper document that contains a watermark, a motif embedded in the paper that provides valuable information on the origins of a document. Developing tools to automatically identify watermarks can make this information more accessible to researchers. This paper focuses on one specific binarization technique, thresholding. Thresholding selects a threshold value and uses it to convert an image to binary, so that one color represents foreground and the other represents background. Ideally, binarization isolates the watermark’s shape by representing it as foreground, and removes unwanted information. This research compares the effectiveness of different thresholding techniques when applied to watermark images. Eight algorithms are selected from the literature, and a novel algorithm is proposed that seeks to improve on the other algorithms when applied to watermarks. All nine algorithms are evaluated quantitatively on synthetic data, and qualitatively through a survey in which participants select the algorithm that appears best and rate it. The results show that no single algorithm works best for all images; however, a logical adaptive approach may work marginally better than the other approaches. Additionally, the presented algorithms do not adequately remove non-watermark information from the images. Further research should be conducted to analyze different binarization techniques in this context.
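To make the idea of thresholding concrete, the following is a minimal sketch of a global threshold applied to an 8-bit grayscale image. It uses Otsu's method to choose the threshold purely as an illustration; the abstract does not say which eight algorithms were selected, so this is not necessarily one of them, and the function names `otsu_threshold` and `binarize` as well as the choice of dark pixels as foreground are assumptions.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Pick the threshold t in 0..255 that maximizes between-class variance.

    Assumes `gray` is a uint8 array; pixels <= t form one class.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the "<= t" class
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                # thresholds with an empty class
    return int(np.argmax(sigma_b))


def binarize(gray: np.ndarray, t: int, dark_foreground: bool = True) -> np.ndarray:
    """Return a boolean mask in which True marks foreground pixels.

    Which side of the threshold counts as foreground is an assumption
    that depends on how the watermark was imaged.
    """
    return gray <= t if dark_foreground else gray > t


if __name__ == "__main__":
    # Tiny synthetic image: dark strokes (50) on a light background (200).
    image = np.full((4, 4), 200, dtype=np.uint8)
    image[1:3, :] = 50
    t = otsu_threshold(image)
    print(t, binarize(image, t).astype(int))
```

A single global threshold like this ignores uneven illumination and paper texture; adaptive methods instead compute a threshold per neighbourhood.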
Watermarks are historical motifs present in the texture of paper that are commonly used to identify the paper manufacturers. They only become visible when viewed under certain light conditions. Under ideal circumstances, researchers may use watermarks to determine a historical document’s origins and context. To identify a watermark, it is matched to a previously archived watermark. Currently, this matching must be done manually, which is neither scalable nor parallelizable. Existing studies explore digital reconstructions of watermarks, but do not focus on a comparison-based setup. This report discusses a system that can automatically identify similar watermarks using traditional image processing techniques. The resulting system speeds up the process considerably, can be used on small datasets, and is more accessible to end-users.
The system uses harmonization, feature extraction, and similarity matching. Harmonization involves improving the clarity of the watermark, which is often obscured by the material properties of the paper. Feature extraction involves finding useful information from the isolated watermarks, and similarity matching uses this information to score the similarity of a pair.
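The three stages can be sketched as a small pipeline. This is only an illustration of how the stages connect: the abstract does not describe the actual methods, so the contrast stretch used for harmonization, the grid-mean features, the cosine similarity, and all function names below are hypothetical stand-ins.

```python
import numpy as np


def harmonize(gray: np.ndarray) -> np.ndarray:
    """Assumed harmonization step: stretch intensities to the range [0, 1]."""
    g = gray.astype(np.float64)
    span = g.max() - g.min()
    if span == 0:
        return np.zeros_like(g)
    return (g - g.min()) / span


def extract_features(img: np.ndarray, grid: int = 4) -> np.ndarray:
    """Assumed feature step: mean intensity of each cell of a grid x grid split.

    Expects a 2-D image with at least `grid` rows and columns.
    """
    h, w = img.shape
    rows = np.array_split(np.arange(h), grid)
    cols = np.array_split(np.arange(w), grid)
    return np.array([img[np.ix_(r, c)].mean() for r in rows for c in cols])


def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Assumed matching step: cosine similarity of two feature vectors."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0 or nb == 0:
        return 0.0
    return float(a @ b / (na * nb))


def score_pair(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """Run both images through all three stages and score the pair."""
    fa = extract_features(harmonize(img_a))
    fb = extract_features(harmonize(img_b))
    return similarity(fa, fb)
```

Ranking archived watermarks by `score_pair` against a query image would then give a list of candidate matches for a researcher to inspect.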
We evaluated our system on a dataset provided by the German Museum of Books and Writing. Over a broader range of quality, accuracy was within the range of 41-53%. Improving watermark quality within the dataset raised accuracy to around 82%. The system shows promise, particularly with higher-quality datasets. This report therefore demonstrates that traditional image processing techniques can be valuable in situations where artificial intelligence may not be possible or efficient. Further research in this domain is required to understand the advantages and limitations of image processing in comparison with artificial intelligence.