applescript find duplicate images and delete small images

AppleScript to Find and Delete Duplicate Images

Finding and deleting duplicate images can free up a significant amount of disk space on your Mac. This AppleScript identifies duplicates by hashing the contents of each file and offers to delete the redundant copies. Because it compares exact bytes, a resized or recompressed version of a picture has a different hash and is not detected as a duplicate. Always back up your data before running any script that modifies files. This script is provided as-is, and you use it at your own risk.

Understanding the Script's Logic

The script works in several stages:

  1. Image Gathering: It gathers all image files in the top level of a folder you select when the script runs.
  2. Duplicate Detection: It compares images using hashes of their file contents (a fingerprint of the data). Identical hashes indicate byte-identical files.
  3. Size Comparison: For each pair of duplicates, it compares file sizes so that the larger copy is the one kept. Byte-identical files have equal sizes, so in practice the copy found first is kept.
  4. Deletion: It moves the other copy to the Trash. It presents a confirmation dialog before deleting each file.

The AppleScript Code

-- Prompt for the folder containing the images
set theFolder to choose folder with prompt "Select the folder containing your images:"

-- Get all image files within the folder as a list of aliases
tell application "Finder"
	set imageFiles to (every file of theFolder whose name extension is in {"jpg", "jpeg", "png", "gif", "tiff"}) as alias list
end tell

-- Return the MD5 hash of a file's contents (-q prints only the hash, not the path)
on getHash(theFile)
	return do shell script "md5 -q " & quoted form of POSIX path of theFile
end getHash

-- Return the position of theHash in hashList, or 0 if it is absent
on indexOfHash(theHash, hashList)
	repeat with i from 1 to count of hashList
		if item i of hashList is theHash then return i
	end repeat
	return 0
end indexOfHash

-- Ask for confirmation, then move the file to the Trash
on confirmAndDelete(theFile)
	tell application "Finder" to set fileName to name of theFile
	display dialog "Delete duplicate image: " & fileName & "?" buttons {"Yes", "No"} default button "No"
	if button returned of result is "Yes" then
		try
			tell application "Finder" to delete theFile
		on error errMsg number errNum
			display dialog "Error deleting file: " & errMsg buttons {"OK"} default button "OK"
		end try
	end if
end confirmAndDelete

-- AppleScript records cannot be keyed by a string built at run time,
-- so hashes and their files are stored in two parallel lists
set knownHashes to {}
set knownFiles to {}

-- Process each image file
repeat with aFile in imageFiles
	set thisFile to contents of aFile
	set theHash to getHash(thisFile)
	set idx to indexOfHash(theHash, knownHashes)
	if idx is 0 then
		-- First file with this hash; remember it
		set end of knownHashes to theHash
		set end of knownFiles to thisFile
	else
		-- Duplicate found, compare sizes
		set keptFile to item idx of knownFiles
		tell application "Finder"
			set thisSize to size of thisFile
			set keptSize to size of keptFile
		end tell
		if thisSize ≤ keptSize then
			-- Same size or smaller: offer to delete this copy
			confirmAndDelete(thisFile)
		else
			-- This copy is larger: offer to delete the stored one and keep this one
			confirmAndDelete(keptFile)
			set item idx of knownFiles to thisFile
		end if
	end if
end repeat

display dialog "Script completed. Deleted files are in the Trash." buttons {"OK"} default button "OK"

How to Use the Script

  1. Save the code: Copy the AppleScript code above and save it as a .applescript file (e.g., deleteDuplicateImages.applescript).
  2. Choose the folder: The set theFolder line uses the choose folder command, which opens a dialog for selecting the folder each time the script runs. To use a fixed folder instead, replace that line with a hardcoded path (e.g., set theFolder to POSIX file "/Users/yourusername/Pictures/MyImages" as alias).
  3. Run the script: Double-click the saved .applescript file to open it in Script Editor, then click Run. The script will prompt you to confirm the deletion of each duplicate image.
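
If you hardcode the folder, a mistyped path makes the alias coercion fail with an error. A minimal sketch of one way to handle that, assuming the placeholder path below (it is illustrative, not a real location), is to fall back to the folder dialog:

```applescript
-- Placeholder path (an assumption for illustration): replace with a real folder
set folderPath to "/Users/yourusername/Pictures/MyImages"

try
	-- Coercing to alias raises an error if nothing exists at the path
	set theFolder to (POSIX file folderPath) as alias
on error
	-- Fall back to asking for the folder interactively
	set theFolder to choose folder with prompt "Folder not found. Select the folder containing your images:"
end try
```

These lines would take the place of the single set theFolder line at the top of the script.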

Important Considerations

  • Error Handling: The script includes basic error handling for file deletion, but unforeseen issues might occur. Always back up your data first.
  • Image Formats: The script currently supports JPG, JPEG, PNG, GIF, and TIFF. You can easily extend it to include other formats by adding them to the list in the whose name extension is in line.
  • Hashing Algorithm: MD5 is used for simplicity and is adequate for spotting accidental duplicates. A stronger algorithm such as SHA-256 makes a false match between two different files even less likely, at the cost of somewhat slower hashing.
  • Large Datasets: For very large numbers of images, this script may take a considerable amount of time to run. Consider optimizing for performance if you're dealing with thousands of files.
  • File Integrity: Ensure your backups are valid. Finder's delete command moves files to the Trash rather than erasing them, so a deletion can be undone only until the Trash is emptied.
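
The hashing and large-dataset points above can be sketched in code. The first handler is a variant of getHash that assumes the shasum tool shipped with macOS. The second, filesWithSharedSize, is a hypothetical helper that is not part of the script: it narrows a list of aliases to the files whose size matches at least one other file, since a file with a unique size cannot be a byte-identical duplicate and does not need hashing. Treat both as sketches under those assumptions rather than a definitive implementation.

```applescript
-- Variant of getHash using SHA-256 (assumes shasum is available, as on stock macOS)
on getHash(theFile)
	-- shasum prints "<hash>  <path>"; awk keeps only the hash column
	return do shell script "shasum -a 256 " & quoted form of POSIX path of theFile & " | awk '{print $1}'"
end getHash

-- Hypothetical helper: keep only files whose size is shared with another file.
-- fileList is assumed to be a list of aliases.
on filesWithSharedSize(fileList)
	set fileCount to count of fileList
	-- Query Finder once per file and cache the sizes
	set sizeList to {}
	tell application "Finder"
		repeat with i from 1 to fileCount
			set end of sizeList to size of (item i of fileList)
		end repeat
	end tell
	set candidates to {}
	repeat with i from 1 to fileCount
		set matches to 0
		repeat with j from 1 to fileCount
			if item j of sizeList = item i of sizeList then set matches to matches + 1
		end repeat
		-- Each size matches itself once, so more than one match means a shared size
		if matches > 1 then set end of candidates to item i of fileList
	end repeat
	return candidates
end filesWithSharedSize
```

With the helper in place, the main loop could iterate over filesWithSharedSize(imageFiles) instead of imageFiles. The size comparison is quadratic in the number of files, but it only compares cached numbers, which is cheap next to reading and hashing every file.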

Remember: This script is a tool to assist in managing your images. Use caution, and always back up your important files before running any script that modifies them.