More settings to control imported keywords

According to the PhotoStructure page "How does PhotoStructure extract keywords from my photos and videos?", the following metadata tags are imported:

  • CatalogSets
  • Categories (this is typically XML-encoded)
  • HierarchicalSubject
  • Keywords
  • LastKeywordXMP
  • Subject
  • TagsList
  • XPKeywords (these are keywords added by the Windows Explorer)

I would like to be able to control which tags PhotoStructure imports. Actually, I only need HierarchicalSubject, as it is the most accurate piece of data. I use Adobe Lightroom Classic to organize my asset library, and it treats HierarchicalSubject as the primary source and writes flattened copies into the Subject and Keywords tags. However, IPTC:Keywords is limited to 64 characters per keyword (see IPTC Tags).

So if you have HierarchicalSubject = a | b | c | Xd, where X is a 64-character string, you will get the flat structures Subject = a, b, c, Xd and Keywords = a, b, c, X (note that d is trimmed from Xd, because Xd is longer than the allowed 64 characters).

I guess that PhotoStructure tries to deduce hierarchical tags from the flat structure, so when it sees Xd in the Subject tag, it realizes it is part of the hierarchical structure kw:a/b/c/Xd.

But when it sees the trimmed value X in the Keywords tag, it cannot deduce its place in the hierarchical structure, and therefore it creates kw:X.

Then I see kw:X in the list of PhotoStructure tags, and it annoys me, as I want to see only my well-structured kw:a/b/c/Xd.

So if I had a way to configure PhotoStructure to stop parsing the Keywords tag, it would solve this annoying issue.
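
The truncation described above can be reproduced with a small Python sketch. It is only an illustration of the problem, not how Lightroom or PhotoStructure actually work: the helper names `flatten` and `truncate_iptc` are made up for this example, and the limit is applied per character, as described in this post.

```python
# Per-keyword limit for IPTC:Keywords as described in this post
# (assumption: counted in characters here, for simplicity).
IPTC_KEYWORD_MAX = 64


def flatten(hierarchical: str, sep: str = "|") -> list[str]:
    """Hypothetical helper: split a hierarchical keyword into flat keywords."""
    return [part.strip() for part in hierarchical.split(sep)]


def truncate_iptc(keywords: list[str], limit: int = IPTC_KEYWORD_MAX) -> list[str]:
    """Hypothetical helper: cut every keyword down to the IPTC length limit."""
    return [kw[:limit] for kw in keywords]


x = "X" * 64                           # a 64-character string
hierarchical = f"a | b | c | {x}d"     # the last level is 65 characters long

subject = flatten(hierarchical)        # untruncated flat copy (Subject)
keywords = truncate_iptc(subject)      # truncated flat copy (Keywords)

# Flat keywords that no longer match any level of the hierarchy.
orphans = [kw for kw in keywords if kw not in subject]
print(orphans == [x])
```

The only orphan is the trimmed value X, which is the keyword that ends up as a stray kw:X tag.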

This is 3 lines of code. It’ll be done in alpha.7 :+1:

# +---------------+
# |  keywordTags  |
# +---------------+
#
# PhotoStructure should look in the following tags for keywords. Note that
# these values are case-sensitive.
# (env: "PS_KEYWORD_TAGS")
#
keywordTags = [
  "CatalogSets",
  "Categories",
  "HierarchicalSubject",
  "Keywords",
  "LastKeywordXMP",
  "Subject",
  "TagsList",
  "XPKeywords"
]
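
As a rough illustration of what this setting controls (not PhotoStructure's actual code), the Python sketch below collects keywords only from the tags named in a list. The function `collect_keywords` and the sample `metadata` dict are hypothetical and exist only for this example.

```python
def collect_keywords(metadata: dict[str, list[str]], keyword_tags: list[str]) -> list[str]:
    """Hypothetical helper: gather keywords from the listed tags only.

    Tag names are matched case-sensitively, and duplicates are dropped
    while keeping the first-seen order.
    """
    seen: list[str] = []
    for tag in keyword_tags:
        for keyword in metadata.get(tag, []):
            if keyword not in seen:
                seen.append(keyword)
    return seen


# Sample metadata, shaped like the Lightroom case from this thread.
metadata = {
    "HierarchicalSubject": ["a|b|c"],
    "Subject": ["a", "b", "c"],
    "Keywords": ["a", "b", "c"],
}

only_hierarchical = collect_keywords(metadata, ["HierarchicalSubject"])
wrong_case = collect_keywords(metadata, ["hierarchicalsubject"])
print(only_hierarchical, wrong_case)
```

With only HierarchicalSubject in the list, the flat copies in Subject and Keywords are never read, so a truncated value cannot leak in. The lower-case tag name matches nothing, mirroring the note that the values are case-sensitive.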

If you can test with alpha.7 and verify that this addresses this concern, that’d be great!

(Oops: I thought closing a topic was just cosmetic: I didn’t realize it prevented subsequent replies!)

Whether a given keyword is treated as hierarchical is determined by the characters in the keywordPathSeparators setting:

# +-------------------------+
# |  keywordPathSeparators  |
# +-------------------------+
#
# PhotoStructure interprets keywords as hierarchical if a path separator
# character is found in a keyword. This allows for tags like
# "Family/Einstein/Albert", "Flora|Fruit|Orange", "Objects⊃Tools⊃Hammer", or
# "Fauna>Oceanic>Pelican". By default, these separators are the forward-slash,
# vertical-bar, and greater-than characters. If you don't want to interpret
# keywords as hierarchical, change this value to an empty string (""). After
# changing this value, you must force-resync your entire library for the
# changes to take effect.
# (env: "PS_KEYWORD_PATH_SEPARATORS")
#
keywordPathSeparators = "/|>⊃"
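
To make the separator behavior concrete, here is a minimal Python sketch of splitting a keyword on any of the configured separator characters. It is an assumption about the general idea, not PhotoStructure's implementation; `split_keyword` is a hypothetical name.

```python
import re

# Same characters as the default keywordPathSeparators value shown above.
DEFAULT_SEPARATORS = "/|>⊃"


def split_keyword(keyword: str, separators: str = DEFAULT_SEPARATORS) -> list[str]:
    """Hypothetical helper: split a keyword into path segments.

    An empty separator string disables splitting, so the keyword is
    returned as a single segment.
    """
    if not separators:
        return [keyword]
    # Build a character class such as [/\|>⊃]; re.escape guards "|".
    pattern = "[" + re.escape(separators) + "]"
    return [part.strip() for part in re.split(pattern, keyword) if part.strip()]


for example in ("Family/Einstein/Albert", "Flora|Fruit|Orange",
                "Objects⊃Tools⊃Hammer", "Fauna>Oceanic>Pelican"):
    print(split_keyword(example))
```

Under this sketch, a flat keyword that merely contains a separator, such as "AC/DC", is also split into two segments, which is why a stray / in a keyword shows up as an extra level in the hierarchy.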

Yes, it addressed my concerns, thanks. I modified the setting to import only HierarchicalSubject but then I found some issues with my data.

Some of my keywords had commas in them. It’s impossible to set such keywords via the Adobe Lightroom Classic UI, so it seems I set them via exiftool during some cleanup process. I am going to fix those commas in my library so they stop confusing other parsers that split keywords on commas into multiple keywords.

Also, I noticed that PhotoStructure adds an unnecessary hierarchical tag if a keyword contains a / character. I am going to fix those keywords in my library as well.

I’ve fixed my tags and executed ./photostructure sync --force --exit-when-done, but the invalid tags are still present in the library. Is there a way to re-sync keywords? Shouldn’t that be the default behavior of sync?

To be more specific: I edited my assets’ keywords, and I expect my changes to be picked up by PhotoStructure's sync process.

The use of --force should have rebuilt tags. I’ll try to reproduce this tomorrow.

Thanks, as always, for reporting this!
