Make Documents Look Like Scanned - Python Project Documentation

Overview

Look-Like-Scanned is a Python-based CLI tool that makes digital documents look like they were scanned. The package is designed to run locally, with a focus on privacy, security, open-source collaboration, and transparency.

Features

Unlimited Usage: No limits, fees, or ads. Free and open source.
Powerful CLI: Use command-line options to convert PDFs to scanned or grayscale PDFs, combine multiple images into one PDF file, and more.
Local Processing: All operations are performed locally on your computer. No need to upload your files to untrustworthy websites.
Environmental Focus: Saves trees by eliminating the need to print and scan paper.

Functionality

The script processes each page of a PDF, or each image in a set of images, and converts it into a new image with slight random adjustments to skew and brightness to mimic the appearance of a scanned document. The processed pages are combined, in their original page or image order, into an output PDF file that is saved in the same folder as the input. The original files remain unchanged, and output files are named {filename}_output.pdf for easy identification.
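
The naming rule and the per-page random adjustment described above can be sketched roughly as follows. This is only an illustration, not the package's actual code: the helper names `output_path` and `page_jitter` are hypothetical, and the skew and brightness ranges are assumed values chosen for the example.

```python
import random
from pathlib import Path


def output_path(source: Path) -> Path:
    """Return the sibling path '{filename}_output.pdf' in the same folder."""
    return source.with_name(f"{source.stem}_output.pdf")


def page_jitter(rng: random.Random,
                max_skew_degrees: float = 0.5,
                max_brightness_shift: float = 0.05):
    """Pick a small random rotation angle and brightness factor for one page.

    Both ranges are assumptions made for this sketch, not documented values.
    """
    angle = rng.uniform(-max_skew_degrees, max_skew_degrees)
    brightness = 1.0 + rng.uniform(-max_brightness_shift, max_brightness_shift)
    return angle, brightness


if __name__ == "__main__":
    rng = random.Random()
    print(output_path(Path("tests/test.pdf")))
    for page_number in range(1, 4):
        angle, brightness = page_jitter(rng)
        print(f"page {page_number}: rotate {angle:+.2f} deg, brightness x{brightness:.3f}")
```

Drawing a fresh angle and brightness for every page is what keeps the pages from looking identically tilted, which is the effect the paragraph above describes.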

Installation

Install from the Python Package Index (PyPI) by running this command:

pip install look-like-scanned

Or build the latest version from GitHub:

# Clone repo from GitHub and install
git clone https://github.com/navchandar/look-like-scanned.git
cd look-like-scanned
pip install poetry
poetry install
pip install .

Verify Installation

# Print the help message and available options to verify the installation
scanner -h
# Scanner version: 1.0.0

Arguments

These are the command-line arguments accepted:

-i, --input_folder - Specifies the input folder to read files from and convert. Default value is the current directory.
-f, --file_type_or_name - Specifies the file types to process or the file name to convert. Default value is "pdf" to convert all PDF files in the given input folder.
-q, --file_quality - Specifies the quality of the converted output files. The value must be between 50 and 100. Default value is 95.
-a, --askew - Controls whether to make the output documents slightly askew (tilted). Accepted values are "yes" or "no". Default value is "yes".
-b, --black_and_white - Controls whether to save output documents in black and white format (to make it look like a photocopy). Accepted values are "yes" or "no". Default value is "no".
-c, --contrast - Controls contrast of the image. A factor of 0.0 gives a solid gray image. A factor of 1.0 gives the original image. Greater values increase the contrast of the image. Default value is 1.
-sh, --sharpness - Controls sharpness of the image. A factor of 0.0 gives a blurred image. A factor of 1.0 gives the original image. Greater values increase the sharpness of the image. Default value is 1.
-br, --brightness - Controls brightness of the image. A factor of 0.0 gives a black image. A factor of 1.0 gives the original image. Greater values increase the brightness of the image. Default value is 1.
-l, --blur - Controls whether to make the output a little bit blurry. Accepted values are "yes" or "no". Default value is "no".
-r, --recurse - Controls whether matching files in subdirectories are also included. Accepted values are "yes" or "no". Default value is "yes".
-s, --sort_by - Controls how the files are sorted: by name, creation time, or modified time. Accepted values are "name", "ctime", "mtime", "none". Default value is "name". If "none" is selected, the default order of files returned by the OS is used for document conversion.
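
To make the defaults and accepted values above concrete, here is a minimal argparse sketch that mirrors the documented options. It is a reconstruction from this list only, not the package's real parser: `build_parser` and `quality` are hypothetical names, and using "." to stand for the current directory is an assumption.

```python
import argparse

YES_NO = ("yes", "no")


def quality(value: str) -> int:
    """Accept an integer between 50 and 100, as documented for -q."""
    number = int(value)
    if not 50 <= number <= 100:
        raise argparse.ArgumentTypeError("quality must be between 50 and 100")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="scanner")
    parser.add_argument("-i", "--input_folder", default=".")  # "." assumed for current directory
    parser.add_argument("-f", "--file_type_or_name", default="pdf")
    parser.add_argument("-q", "--file_quality", type=quality, default=95)
    parser.add_argument("-a", "--askew", choices=YES_NO, default="yes")
    parser.add_argument("-b", "--black_and_white", choices=YES_NO, default="no")
    parser.add_argument("-c", "--contrast", type=float, default=1.0)
    parser.add_argument("-sh", "--sharpness", type=float, default=1.0)
    parser.add_argument("-br", "--brightness", type=float, default=1.0)
    parser.add_argument("-l", "--blur", choices=YES_NO, default="no")
    parser.add_argument("-r", "--recurse", choices=YES_NO, default="yes")
    parser.add_argument("-s", "--sort_by",
                        choices=("name", "ctime", "mtime", "none"),
                        default="name")
    return parser


if __name__ == "__main__":
    # Equivalent to: scanner -i tests -f png -sh 5 -c 0.5
    args = build_parser().parse_args(["-i", "tests", "-f", "png", "-sh", "5", "-c", "0.5"])
    print(args)
```

Declaring the yes/no options with `choices` means a typo such as `-a maybe` is rejected up front with a usage error instead of being silently ignored.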

Usage Examples

This package uses PIL and pypdfium2 to convert and manipulate image and PDF objects.

These libraries are wrapped in a command-line interface (CLI) for easy use.

# Convert all pdf files in given folder to scanned pdf
scanner -i .\tests

# Convert all pdf files in folder to scanned pdf without skew
scanner -i .\tests -a no

# Convert specific pdf file in folder to scanned pdf
scanner -i .\tests -f "test.pdf"

# Convert all jpg, jpeg, png, webp files in folder to one pdf file
scanner -i .\tests -f "image"

# Convert all png files in folder to one pdf file with 100% quality
scanner -i .\tests -f "png" -q 100

# Convert specific jpg file in folder to one pdf file with 75% quality
scanner -i .\tests -f "JPG_Test.jpg" -q 75

# Convert all pdf files including sub folders
scanner -i .\tests -f "pdf" -r yes

# Convert all images including sub folders into one pdf
scanner -i .\tests -f "image" -r yes

# Convert all image files in folder in the order of file names
scanner -i .\tests -f "image" -s "name"

# Convert all pdf files including sub folders and save in black & white format
scanner -i .\tests -f "pdf" -r yes -b yes

# Convert all png files including sub folders, save in black & white format and make it a little blurry
scanner -i .\tests -f "png" -r yes -b yes -l yes

# Convert all pdf files in folder to scanned pdf, set contrast, sharpness and brightness factors
scanner -i .\tests -c 2 -sh 10 -br 2

# Convert all pdf files in folder to scanned pdf with high contrast and sharpness
scanner -i .\tests -f "pdf" -c 3 -sh 5

# Convert all pdf files in folder to scanned pdf, including subfolders, with high contrast and brightness
scanner -i .\tests -f "pdf" -r yes -c 3 -br 3

# Convert all image files in folder to pdf, sorted by creation time
scanner -i .\tests -f "image" -s "ctime"

# Convert all image files in folder to pdf, sorted by modified time, with slight blur
scanner -i .\tests -f "image" -s "mtime" -l yes

# Convert all jpg files in folder to pdf with 80% quality and slightly askew
scanner -i .\tests -f "jpg" -q 80 -a yes

# Convert all png files in folder to pdf with high sharpness and low contrast
scanner -i .\tests -f "png" -sh 5 -c 0.5
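
The contrast and brightness factors used in the examples above can be understood with a small pure-Python model. The sketch assumes the factors behave like the enhancement factors in Pillow's ImageEnhance module, whose documented behaviour matches the wording under Arguments: the result is a blend between a "degenerate" image (black for brightness, solid mean gray for contrast) and the original, clipped to the 0 to 255 range. The function names are hypothetical, and the model works on a plain list of grayscale values rather than a real image.

```python
def blend(degenerate, original, factor):
    """Interpolate (factor < 1) or extrapolate (factor > 1) between two pixel rows."""
    result = []
    for d, o in zip(degenerate, original):
        value = d + factor * (o - d)
        result.append(max(0, min(255, round(value))))  # clip to the 8-bit range
    return result


def adjust_brightness(pixels, factor):
    """Factor 0.0 gives black, 1.0 gives the original, larger values brighten."""
    return blend([0] * len(pixels), pixels, factor)


def adjust_contrast(pixels, factor):
    """Factor 0.0 gives solid gray, 1.0 gives the original, larger values add contrast."""
    mean = round(sum(pixels) / len(pixels))
    return blend([mean] * len(pixels), pixels, factor)


if __name__ == "__main__":
    row = [40, 100, 160, 220]          # one row of grayscale pixels
    print(adjust_contrast(row, 0.5))   # values pulled toward the mean (130)
    print(adjust_contrast(row, 2))     # values pushed away from the mean
    print(adjust_brightness(row, 2))   # values doubled, then clipped at 255
```

This also shows why large factors such as `-c 3` or `-br 3` wash out detail: many pixels are pushed past the ends of the range and clipped to pure black or pure white.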

❗❗ Note: ❗❗

The supported file types are: ".jpg", ".png", ".jpeg", ".webp", ".pdf".
The output PDF file will be larger than the input file because the pages are stored as images.
Bookmarks / Links / Metadata will be removed when saving the output file.
Transparency will be removed from PNG files when converting to PDF.
Password-protected PDF files are not yet supported.
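
How the supported file types interact with the -f, -r, and -s options can be sketched with the standard library alone. This is a hypothetical illustration, not the package's implementation: `collect_files` is an invented name, and the matching rules (case-insensitive suffixes and names, "jpg" matching only ".jpg", sorting by lower-cased file name) are assumptions made for the example.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}
SUPPORTED_SUFFIXES = IMAGE_SUFFIXES | {".pdf"}


def collect_files(folder, file_type_or_name="pdf", recurse=True, sort_by="name"):
    """Return the files in `folder` that match a type ("pdf", "image", "png") or a file name."""
    folder = Path(folder)
    wanted = file_type_or_name.lower()
    if wanted == "image":
        suffixes = IMAGE_SUFFIXES
    elif "." + wanted in SUPPORTED_SUFFIXES:
        suffixes = {"." + wanted}
    else:
        suffixes = None  # not a known type, so treat the value as a file name

    candidates = folder.rglob("*") if recurse else folder.glob("*")
    matches = []
    for path in candidates:
        if not path.is_file():
            continue
        if suffixes is not None:
            if path.suffix.lower() in suffixes:
                matches.append(path)
        elif path.name.lower() == wanted:
            matches.append(path)

    if sort_by == "name":
        matches.sort(key=lambda p: p.name.lower())
    elif sort_by == "ctime":
        matches.sort(key=lambda p: p.stat().st_ctime)
    elif sort_by == "mtime":
        matches.sort(key=lambda p: p.stat().st_mtime)
    # sort_by == "none": keep whatever order the OS returned
    return matches


if __name__ == "__main__":
    for path in collect_files(".", "image", recurse=True, sort_by="mtime"):
        print(path)
```

Because images are combined into a single PDF, the sort order chosen here is what determines the page order of the output.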

Authors

License:

OSI Approved - MIT License

Contact:

For Feedback / Issues / Bugs

Maintained By

Naveenchandar