New in v2.1: exclude files with globs

PhotoStructure’s directory walker was rewritten in v2.1 to add glob support.

What’s this?

PhotoStructure ships with several hundred built-in exclusion patterns to avoid walking into system and application support directories. Read more about this here.

If any of these patterns cause your files to be ignored, or you’re already using some other naming convention to exclude directories (instead of, for example, NoMedia), the new-to-v2.1 excludeGlobsOmit and excludeGlobsAdd settings should give you the flexibility to teach PhotoStructure to import just the right set of files.

Prior approaches

Versions prior to v2.1-alpha.4 used a pleasing melange of regular expressions, string matching, and subdirectory matching to detect system and application support directories, along with a neverIgnored setting which was a simple set of pathnames.

In trying to make the code easier to explain and configure, an initial attempt in alpha.2/alpha.3 used a new setting called globs, but that implementation didn’t work well with the built-in exclusion patterns and led to a couple of bugs and a good deal of confusion, hence this new approach.

New approach in v2.1.0-alpha.4: one set of exclusion patterns

  • PhotoStructure’s set of exclusion patterns and RegExps have all been converted to glob patterns, the excludeGlobs.

  • During import, all files and directories will be matched against all excludeGlobs patterns. Any matching files or directories will be excluded from being imported into your library.

  • These patterns can be adjusted via new excludeGlobsAdd and excludeGlobsOmit settings, which let you add and remove glob patterns from excludeGlobs.
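As a rough illustration of how the two settings combine, here is a minimal Python sketch. The function name and the merge order (drop the omitted built-ins first, then append the additions) are assumptions made for illustration, not PhotoStructure's actual code.

```python
def effective_exclude_globs(built_in, add=(), omit=()):
    """Return the built-in patterns minus `omit`, followed by `add`.

    Hypothetical helper: the real merge logic may differ.
    """
    omitted = set(omit)
    # Keep built-in patterns that were not explicitly omitted.
    result = [g for g in built_in if g not in omitted]
    # Append user-supplied patterns, skipping duplicates.
    for g in add:
        if g not in result:
            result.append(g)
    return result


built_in = ["**/__MACOSX/", "**/_includes/", "**/.*/"]
print(effective_exclude_globs(
    built_in,
    add=["**/screenshots/"],
    omit=["**/__MACOSX/"],
))
# ['**/_includes/', '**/.*/', '**/screenshots/']
```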

What’s a “glob”?

“Globs” are patterns that may include asterisks, like /a/*/c (which matches any file or sub-directory named c found directly inside any sub-directory of /a), and other “magic” characters to match desired patterns.

Magic glob characters

  • * Matches 0 or more characters in a single path portion.
  • ? Matches 1 character.
  • [...] Matches a range of characters, similar to a RegExp range. If the first character of the range is ! or ^ then it matches any character not in the range.
  • !(pattern|pattern|pattern) Matches anything that does not match any of the patterns provided.
  • ?(pattern|pattern|pattern) Matches zero or one occurrence of the patterns provided.
  • +(pattern|pattern|pattern) Matches one or more occurrences of the patterns provided.
  • *(pattern|pattern|pattern) Matches zero or more occurrences of the patterns provided.
  • ** If a “globstar” is alone in a path portion, then it matches zero or more directories and subdirectories searching for matches. It does not crawl symlinked directories.

This list is courtesy of glob. FWIW, PhotoStructure uses picomatch with default options.
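To get a feel for a few of these magic characters, here is a small hand-rolled Python sketch that translates a subset of glob syntax into a regular expression. It only handles `**/`, `*`, `?`, and `?(a|b)`, and it is not picomatch: the helper names are made up, and edge cases (hidden files, `!(...)`, character ranges) are deliberately left out.

```python
import re


def glob_to_regex(glob: str) -> "re.Pattern[str]":
    """Translate a small subset of glob syntax into a compiled regex.

    Supported: `**/` (zero or more directories), `*`, `?`, and
    `?(a|b)` (zero or one of the alternatives). Anything else is literal.
    """
    out = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            out.append(r"(?:[^/]*/)*")  # zero or more whole path portions
            i += 3
        elif glob.startswith("?(", i):
            end = glob.index(")", i)
            alts = "|".join(re.escape(a) for a in glob[i + 2:end].split("|"))
            out.append(f"(?:{alts})?")  # zero or one of the alternatives
            i = end + 1
        elif glob[i] == "*":
            out.append(r"[^/]*")  # any characters within one path portion
            i += 1
        elif glob[i] == "?":
            out.append(r"[^/]")  # exactly one character
            i += 1
        else:
            out.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(out))


def matches(glob: str, path: str) -> bool:
    """True if the whole path matches the glob."""
    return glob_to_regex(glob).fullmatch(path) is not None


print(matches("/a/*/c", "/a/b/c"))                          # True
print(matches("/a/*/c", "/a/b/d/c"))                        # False
print(matches("**/Thumbnail?(s)/", "/photos/Thumbnails/"))  # True
print(matches("**/*.cr3", "/photos/2022/img_0001.cr3"))     # True
```

Note how `*` stops at a forward-slash while `**/` spans any number of directories.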

Notes

Always suffix directory patterns with a forward-slash

To disambiguate between files and directories, PhotoStructure applies glob patterns that end with a forward-slash to directories, and all other patterns to files.

Directories will be matched as they are visited, so **/node_modules/ will exclude all Node.js node_modules directories, regardless of depth.
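Here is a minimal Python sketch of the "matched as they are visited" idea: excluded directories are pruned during the walk, so nothing beneath them is ever seen. It is a simplification under stated assumptions: the function names are hypothetical, it only understands directory patterns of the simple form `**/NAME/`, and it uses Python's `fnmatch` on the directory name rather than picomatch.

```python
import fnmatch
import os


def dir_name_globs(globs):
    """Keep only directory patterns shaped like `**/NAME/` and return NAME.

    Patterns without a trailing forward-slash are file patterns and are skipped.
    """
    names = []
    for g in globs:
        if g.startswith("**/") and g.endswith("/"):
            name = g[3:-1]
            if "/" not in name:
                names.append(name)
    return names


def walk_excluding(root, globs):
    """Yield file paths under `root`, never descending into excluded directories."""
    names = dir_name_globs(globs)
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning `dirnames` in place stops os.walk from visiting those directories.
        dirnames[:] = [
            d for d in dirnames
            if not any(fnmatch.fnmatchcase(d, n) for n in names)
        ]
        for filename in filenames:
            yield os.path.join(dirpath, filename)
```

With `["**/node_modules/"]`, a file at `proj/node_modules/pkg/logo.png` is never yielded, no matter how deep the `node_modules` directory sits.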

Always use forward-slashes in your patterns (even on Windows)

Back-slashes are for escaping characters. Windows pathnames will have backslashes replaced with forward slashes before matching against glob patterns.
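The normalization described above can be sketched in a line of Python. The function name is hypothetical, and this only shows the idea of swapping separators before matching, not PhotoStructure's actual code.

```python
def normalize_for_glob(native_path: str) -> str:
    """Replace Windows back-slashes with forward-slashes before glob matching."""
    return native_path.replace("\\", "/")


print(normalize_for_glob(r"X:\Photos\WhatsApp\2022\image.jpg"))
# X:/Photos/WhatsApp/2022/image.jpg
```

This is why a pattern like C:/tmp/ works on Windows, while a pattern containing back-slashes would be read as escaping the following character.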

Patterns for root directories

To match a root directory on macOS or Linux, prefix the pattern with a forward-slash (like /tmp/).

To match a root directory on Windows, either include the drive letter, or use * (like C:/tmp/ or */tmp/).

Use ./photostructure info --globs to list patterns

Inspect the excludeGlobs file and directory arrays via the info tool:

$ ./photostructure info --globs
{
  excludeGlobs: {
    file: [
      { glob: '**/.*', desc: 'hidden file' },
      { glob: '**/facetile*', desc: 'face thumbnail' }
    ],
    dir: [
      { glob: '**/__MACOSX/', desc: 'macOS resource fork' },
      { glob: '**/_includes/', desc: 'code' },
      { glob: '**/.*/', desc: 'hidden dir' },
...

Add --flat and --filter arguments if you want just the array of directory globs:

$ ./photostructure info --globs --filter excludeGlobs.dir.glob --flat
[
  '**/__MACOSX/',
  '**/_includes/',
  '**/.*/',
  '**/@eaDir/',
  '**/@Recycle/',
  '**/@SynoResource/',
...

Here’s an example of removing the __MACOSX pattern:

$ PS_EXCLUDE_GLOBS_OMIT='**/__MACOSX/' ./photostructure info --globs --filter excludeGlobs.dir.glob --flat
[
  '**/_includes/',
  '**/.*/',
  '**/@eaDir/',
  '**/@Recycle/',
  '**/@SynoResource/',
...

Remember that other filters still apply

Note that these globs only override PhotoStructure’s built-in file and directory exclusion patterns. All of the other filters still apply, including:

  • requireMakeModel,
  • rejectRatingsLessThan,
  • keywordBlocklist,
  • minImageDimension,
  • minVideoDimension,
  • minVideoDurationSec,
  • maxVideoDurationSec,
  • minAssetFileSizeBytes,
  • maxAssetFileSizeBytes

(note this list is current as of v2.1, and should be expected to change in future releases)

Examples

Exclude custom directories

If you always put random screenshots that you don’t want in your library in a “screenshots” directory, add this pattern to your excludeGlobsAdd:

**/screenshots/

Note that the pattern ends with a forward-slash, /: this tells PhotoStructure that this pattern is for directories, not for files.

Exclude a specific file extension

If for some reason you don’t want to import any .CR3 file, add this to your excludeGlobsAdd:

**/*.cr3

Include hidden directories

If you want PhotoStructure to scan directories that start with a period, add

**/.*/

to your excludeGlobsOmit setting.

I added a pattern to excludeGlobsAdd, but files that should match that glob are still in my library!

First, make sure you structured the glob correctly:

  • if you want to exclude directories, the pattern must end with a forward slash (even on Windows)
  • glob patterns always use forward-slashes, even on Windows

Second, know that PhotoStructure needs to sync the affected directories before your glob comes into play. There’s a “cleanup” step after syncing a directory: any file from that directory hierarchy that wasn’t updated since the start of the sync will be re-examined and, most likely, removed from the library. The original file won’t be touched: only the asset and asset file db rows, the preview images, and, if present, the transcoded video are removed.

This cleanup only runs at the end of syncing a directory. If you add a glob but never re-sync the impacted files, this cleanup step won’t run, and your new exclusion glob won’t be in effect.

If you re-add a parent directory of the files you want removed to your scan paths and resync, this should “force” the cleanup step to run and respect your exclusion glob.
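To make the cleanup step a bit more concrete, here is a minimal Python sketch of how its candidates could be picked. Everything in it is hypothetical (the `AssetFile` shape, the field names, the timestamps); it only illustrates the rule described above: rows under the synced directory that the sync did not touch get re-examined.

```python
from dataclasses import dataclass


@dataclass
class AssetFile:
    path: str          # forward-slash path of a previously imported file
    updated_at: float  # time of the last sync that touched this row


def cleanup_candidates(rows, synced_dir, sync_started_at):
    """Rows under `synced_dir` that were not updated since the sync started."""
    prefix = synced_dir.rstrip("/") + "/"
    return [
        r for r in rows
        if r.path.startswith(prefix) and r.updated_at < sync_started_at
    ]


rows = [
    AssetFile("/photos/2022/cat.jpg", updated_at=205.0),           # visited by this sync
    AssetFile("/photos/screenshots/shot.png", updated_at=100.0),   # skipped: now excluded
    AssetFile("/other/dog.jpg", updated_at=100.0),                 # outside the synced dir
]
stale = cleanup_candidates(rows, "/photos", sync_started_at=200.0)
print([r.path for r in stale])
# ['/photos/screenshots/shot.png']
```

A newly excluded file is never visited by the walker, so its row is not refreshed, and it ends up as a cleanup candidate. Files outside the synced directory are left alone, which is why the sync has to cover the impacted files.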

excludeGlobs for v2.1.0-alpha.4

Here’s a list of globs shipping with version 2.1.0-alpha.4. Note that these may be adjusted in future versions. Feel free to suggest more default patterns, or edits to existing patterns!

{
  "excludeGlobs": {
    "file": [
      {
        "glob": "**/.*",
        "desc": "hidden file"
      },
      {
        "glob": "**/facetile*",
        "desc": "face thumbnail"
      }
    ],
    "dir": [
      {
        "glob": "**/__MACOSX/",
        "desc": "macOS resource fork"
      },
      {
        "glob": "**/_includes/",
        "desc": "code"
      },
      {
        "glob": "**/.*/",
        "desc": "hidden dir"
      },
      {
        "glob": "**/@eaDir/",
        "desc": "Synology thumbnail"
      },
      {
        "glob": "**/@Recycle/",
        "desc": "QNAP trash"
      },
      {
        "glob": "**/@SynoResource/",
        "desc": "Synology metadata"
      },
      {
        "glob": "**/*.lrdata/",
        "desc": "LightRoom data"
      },
      {
        "glob": "**/*.photo?(s)library/(attachments|database|external|preview?(s)|resource?(s)|scopes)/",
        "desc": "Apple Photo"
      },
      {
        "glob": "**/*.sparsebundle/bands/",
        "desc": "Time Machine backup"
      },
      {
        "glob": "**/#recycle/",
        "desc": "Synology trash"
      },
      {
        "glob": "**/#snapshot/",
        "desc": "fs snapshot"
      },
      {
        "glob": "**/$Recycle.Bin/",
        "desc": "Windows trash"
      },
      {
        "glob": "**/3rdParty/",
        "desc": "code"
      },
      {
        "glob": "**/appdata/(local|locallow|roaming)/",
        "desc": "Windows default AppData"
      },
      {
        "glob": "**/Application Data/",
        "desc": "Windows application support"
      },
      {
        "glob": "**/Application Support/",
        "desc": "Windows application support"
      },
      {
        "glob": "**/Applications/",
        "desc": "macOS Applications dir"
      },
      {
        "glob": "**/arangodb/",
        "desc": "3rd party"
      },
      {
        "glob": "**/cache/",
        "desc": "Cache dir"
      },
      {
        "glob": "**/CacheClip/",
        "desc": "Movie cache dir"
      },
      {
        "glob": "**/caches/",
        "desc": "Cache dir"
      },
      {
        "glob": "**/cmake/",
        "desc": "code"
      },
      {
        "glob": "**/com.apple.TimeMachine.localsnapshots/",
        "desc": "macOS"
      },
      {
        "glob": "**/contents/(frameworks|plugins|resources|sharedsupport)/",
        "desc": "macOS application"
      },
      {
        "glob": "**/cpan/",
        "desc": "code"
      },
      {
        "glob": "**/cygwin?(64)/(bin|cygdrive|dev|etc|games|include|lib?(32|64|x64)|local|locale|man|proc|sbin|share|src|tmp|usr|var)/",
        "desc": "Cygwin"
      },
      {
        "glob": "**/data/$of/",
        "desc": "windows backup"
      },
      {
        "glob": "**/DefinitelyTyped/",
        "desc": "TypeScript type"
      },
      {
        "glob": "**/Desktop DB/",
        "desc": "macOS metadata"
      },
      {
        "glob": "**/Desktop DF/",
        "desc": "macOS metadata"
      },
      {
        "glob": "**/Desktop.ini/",
        "desc": "macOS metadata"
      },
      {
        "glob": "**/dev/(block|bsg|bus|char|cpu|disk|dri|fd|hugepages|input|lightnvm|mapper|mqueue|net|pts|shm|snd|ubuntu-vg|usb|vfio)/",
        "desc": "Linux /dev backup"
      },
      {
        "glob": "**/DisplayDriver/",
        "desc": "Windows driver"
      },
      {
        "glob": "**/DLLs/",
        "desc": "Windows library"
      },
      {
        "glob": "**/docs/(admin|content|general|generated|sql)/",
        "desc": "code"
      },
      {
        "glob": "**/dyld/",
        "desc": "macOS library"
      },
      {
        "glob": "**/ehthumbs.db/",
        "desc": "Windows thumbnail"
      },
      {
        "glob": "**/go/(bin|blog|cmd|doc|lib|misc|pkg|src|test|vt)/",
        "desc": "code"
      },
      {
        "glob": "**/i18n/",
        "desc": "i18n code"
      },
      {
        "glob": "**/ImageMagick*/",
        "desc": "ImageMagick source"
      },
      {
        "glob": "**/iMovie Cache/",
        "desc": "Apple iMovie cache"
      },
      {
        "glob": "**/Install macOS*/",
        "desc": "macOS installer"
      },
      {
        "glob": "**/Install OS X*/",
        "desc": "macOS installer"
      },
      {
        "glob": "**/iTunes/",
        "desc": "Apple iTunes"
      },
      {
        "glob": "**/iTunes Cache/",
        "desc": "Apple iTunes"
      },
      {
        "glob": "**/iTunes Media/",
        "desc": "Apple iTunes"
      },
      {
        "glob": "**/lfs/(incomplete|objects)/",
        "desc": "Git LFS object"
      },
      {
        "glob": "**/lib/",
        "desc": "code"
      },
      {
        "glob": "**/lib/(firmware|modules|systemd|udev|x86_64-linux-gnu)/",
        "desc": "Linux system package"
      },
      {
        "glob": "**/library/(accessibility|accessibilitybundles|accounts|address book plug-ins|apple|applemediaservices|application support|assetcache|assets|assetsv2|assettypedescriptors|assistant|audio|awd|bridgesupport|bundles|cachedelete|caches|cardkit|classroom|colorpickers|colors|colorsync|components|compositions|configurationprofiles|contextual menu items|coreaccessories|coreanalytics|coreimage|coremediaio|coreservices|cryptotokenkit|defaultsconfigurations|desktop pictures|developer|dictionaries|differentialprivacy|directoryservices|display|displays|distributedevaluation|documentation|driverextensions|dtds|duetactivityscheduler|extensions|fdr|featureflags|filesystems|filters|fonts|frameworks|gpubundles|graphics|hidplugins|identityservices|image capture|input methods|installersandboxes|internet plug-ins|internetaccounts|isp|itunes|java|kerberosplugins|kernelcollections|kernels|keyboard layouts|keychain|keychains|lasecureio|launchagents|launchdaemons|lexicons|linguisticdata|locationbundles|loginplugins|logs|mediastreamplugins|messages|messagetracer|modem scripts|monitorpanels|multiverseplugins|networkserviceproxy|onboardingbundles|opendirectory|openssl|osanalytics|pairedsyncservices|password server filters|pdf services|perl|preferencebundles|preferencepanes|preferences|preferencessyncbundles|printers|privateframeworks|privilegedhelpertools|python|quicklook|quicktime|receipts|recents|ruby|runningboard|sandbox|screen savers|screenreader|script editor plugins|scriptingadditions|scripts|security|services|sounds|speech|speechbase|spotlight|stageddriverextensions|stagedextensions|startupitems|syncservices|systemconfiguration|systemextensions|systemmigration|systemprofiler|tcl|templates|textencodings|textinput|trial|updates|user pictures|user template|usereventplugins|usernotifications|video|videoprocessors|webserver|widgets|xpc)/",
        "desc": "macOS Library"
      },
      {
        "glob": "**/libs/",
        "desc": "code"
      },
      {
        "glob": "**/local/(bin|cygdrive|dev|etc|games|include|lib?(32|64|x64)|local|locale|man|proc|sbin|share|src|tmp|usr|var)/",
        "desc": "FHS system"
      },
      {
        "glob": "**/lost+found/",
        "desc": "fsck recovered file"
      },
      {
        "glob": "**/macOS Install*/",
        "desc": "macOS installer"
      },
      {
        "glob": "**/MinGW*/",
        "desc": "code"
      },
      {
        "glob": "**/mnt/(cache|crash|games|lib|local|lock|log|logs|mail|run|snap|spool|tmp)/",
        "desc": "Linux /mnt"
      },
      {
        "glob": "**/msys*/",
        "desc": "code"
      },
      {
        "glob": "**/msys*/(clang*|dev|etc|mingw*|tmp|ucrt*|var)/",
        "desc": "code"
      },
      {
        "glob": "**/Network Trash/",
        "desc": "AFP trash"
      },
      {
        "glob": "**/nix/store/",
        "desc": "NixOS package"
      },
      {
        "glob": "**/node_modules/",
        "desc": "Node.js library"
      },
      {
        "glob": "**/opt/(bin|cygdrive|dev|etc|games|google|include|lib?(32|64|x64)|local|locale|man|proc|sbin|share|src|tmp|usr|var|x11)/",
        "desc": "FHS package"
      },
      {
        "glob": "**/osv/(apps|arch|compiler|external|include|java|modules|musl|tests)/",
        "desc": "code"
      },
      {
        "glob": "**/packages/(*.mpkg|*.pkg)/",
        "desc": "macOS package"
      },
      {
        "glob": "**/Perl?(64)/(bin|eg|etc|html|lib|site)/",
        "desc": "code"
      },
      {
        "glob": "**/pg/pgsql/",
        "desc": "code"
      },
      {
        "glob": "**/pkg/(acceptance|ccl|cli|cmd|server|sql|storage|ui|util|workload)/",
        "desc": "code"
      },
      {
        "glob": "**/pkgconfig/",
        "desc": "code"
      },
      {
        "glob": "**/pkgs/",
        "desc": "Python code"
      },
      {
        "glob": "**/proc/(acpi|asound|bus|driver|fs|ipmi|irq|net|scsi|self|sys|sysvipc|thread-self|tty)/",
        "desc": "Linux /proc backup"
      },
      {
        "glob": "**/Program Files?( x86)/",
        "desc": "Windows program"
      },
      {
        "glob": "**/ProgramData/",
        "desc": "Windows application support"
      },
      {
        "glob": "**/ps/(cache|config|logs|tmp)/",
        "desc": "PhotoStructure docker dir"
      },
      {
        "glob": "**/Python*/(dlls|doc?(s)|include|lib?(s)|scripts|tcl|tools)/",
        "desc": "code"
      },
      {
        "glob": "**/resources/media/face/",
        "desc": "face thumbnail"
      },
      {
        "glob": "**/Ruby*/(bin|include|lib|msys*|packages|share|ssl)/",
        "desc": "code"
      },
      {
        "glob": "**/rubygems-*/",
        "desc": "Ruby code"
      },
      {
        "glob": "**/sdk/(build-tools|emulator|extras|patcher|platform-tools|platforms|sources|tools)/",
        "desc": "code"
      },
      {
        "glob": "**/site-packages/",
        "desc": "Python code"
      },
      {
        "glob": "**/spotifycache/",
        "desc": "Spotify cache"
      },
      {
        "glob": "**/src/**/(bin|ci|dist|doc?(s)|etc|lib?(s)|main|spec?(s)|src|test?(s)|tools|util)/",
        "desc": "code"
      },
      {
        "glob": "**/SteamApps/",
        "desc": "Game files"
      },
      {
        "glob": "**/System/(applications|developer|driverkit|iossupport|library)/",
        "desc": "macOS System"
      },
      {
        "glob": "**/System Volume Information/",
        "desc": "Windows system metadata"
      },
      {
        "glob": "**/System32/",
        "desc": "Windows system"
      },
      {
        "glob": "**/temp/",
        "desc": "temporary file"
      },
      {
        "glob": "**/Temporary Items/",
        "desc": "temporary file"
      },
      {
        "glob": "**/test/fixtures/",
        "desc": "code"
      },
      {
        "glob": "**/test_suite/",
        "desc": "code"
      },
      {
        "glob": "**/testutil?(s)/",
        "desc": "code"
      },
      {
        "glob": "**/third_party/",
        "desc": "code"
      },
      {
        "glob": "**/Thumbnail?(s)/",
        "desc": "thumbnail"
      },
      {
        "glob": "**/Thumbs.db/",
        "desc": "Windows thumbnail"
      },
      {
        "glob": "**/tmp/",
        "desc": "temporary file"
      },
      {
        "glob": "**/Trash/",
        "desc": "trash dir"
      },
      {
        "glob": "**/usr/(bin|cygdrive|dev|etc|games|include|lib?(32|64|x64)|local|locale|man|proc|sbin|share|src|tmp|usr|var)/",
        "desc": "FHS system"
      },
      {
        "glob": "**/var/(cache|crash|games|lib|local|lock|log|logs|mail|run|snap|spool|tmp)/",
        "desc": "Linux /var"
      },
      {
        "glob": "**/Windows/(boot|containers|cursors|fonts|help|installer|logs|microsoft.net|servicing|softwaredistribution|system?(32)|syswow64|temp)/",
        "desc": "Windows system dir"
      },
      {
        "glob": "**/Windows10Upgrade/",
        "desc": "Windows upgrade"
      },
      {
        "glob": "**/Xcode.app/",
        "desc": "macOS code"
      },
      {
        "glob": "/bin/",
        "desc": "command binary"
      },
      {
        "glob": "/Dell/",
        "desc": "Windows driver"
      },
      {
        "glob": "/dev/",
        "desc": "device file"
      },
      {
        "glob": "/Drivers/",
        "desc": "Windows driver"
      },
      {
        "glob": "/etc/",
        "desc": "system configuration file"
      },
      {
        "glob": "/home/mrm/.config/",
        "desc": "env.PS_CONFIG_DIR"
      },
      {
        "glob": "/home/mrm/snap/",
        "desc": "Snap software"
      },
      {
        "glob": "/initrd/",
        "desc": "initial ramdisk"
      },
      {
        "glob": "/Intel/",
        "desc": "Windows driver"
      },
      {
        "glob": "/lib?(32|64|x64)/",
        "desc": "system library"
      },
      {
        "glob": "/lost+found/",
        "desc": "fsck recovered file"
      },
      {
        "glob": "/Microsoft/",
        "desc": "Windows driver"
      },
      {
        "glob": "/NVIDIA/",
        "desc": "Windows driver"
      },
      {
        "glob": "/proc/",
        "desc": "process metadata"
      },
      {
        "glob": "/sbin/",
        "desc": "system binary"
      },
      {
        "glob": "/snap/",
        "desc": "snap software"
      },
      {
        "glob": "/sys/",
        "desc": "system metadata"
      },
      {
        "glob": "/Windows/",
        "desc": "Windows system"
      },
      {
        "glob": "/Windows.old/",
        "desc": "Windows system"
      }
    ]
  }
}
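If you save JSON shaped like the listing above and want just the directory globs (similar in spirit to the --filter and --flat arguments), a few lines of Python will do. This is a sketch with a hypothetical function name, and it assumes the input is valid JSON with the same excludeGlobs structure.

```python
import json


def dir_globs(document: str):
    """Return the glob strings from excludeGlobs.dir in a JSON document."""
    data = json.loads(document)
    return [entry["glob"] for entry in data["excludeGlobs"]["dir"]]


sample = """
{
  "excludeGlobs": {
    "file": [{"glob": "**/.*", "desc": "hidden file"}],
    "dir": [
      {"glob": "**/__MACOSX/", "desc": "macOS resource fork"},
      {"glob": "**/@eaDir/", "desc": "Synology thumbnail"}
    ]
  }
}
"""
print(dir_globs(sample))
# ['**/__MACOSX/', '**/@eaDir/']
```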

Update 2022-06-27

The prior globs implementation caused confusion and new import bugs. Glob patterns are now only used for exclusions.

Click the orange pencil icon on this post to view the prior approach.

Pardon my ignorance here, but could you please provide a few concrete examples? Does this need to go to the settings.toml file?

Good idea: I added one example, with links to the settings docs. Tell me what you’d specifically like to include or exclude, and I can add that as an additional example.

Your example is already very good. Additionally, one example with a specific folder/path?

Done: holler if you have any other questions!

Ok, so I could exclude by adding the following entry?

!*X:/abc/WhatsApp/ *

Almost: I’d go with an even simpler pattern: **/WhatsApp/ will skip over any file in a WhatsApp directory or subdirectory (so X:\Photos\WhatsApp\2022\image.jpg would get excluded, for example)

UPDATE July 3, 2022: With v2.1.0-alpha.4 and later builds, PhotoStructure doesn’t require a ** suffix to be added to directories, and only does exclusion globs, so this pattern becomes **/WhatsApp/.

Since you have QNAP’s @Recycle in the default set, you might want to add @Recently-Snapshot as well. It’s the QNAP filesystem snapshot directory.

Oh nice, thanks! This is in the next build.