OpenRefine - wtkns.com
1. install
2. start OpenRefine
3. Create project
4. Publisher > Text Facet
    1. (457 results)
    2. view by count
    3. Fantagraphics Books (99)(21)(21)
    4. combine 3x ()
    5. view by name
    6. still 4 different Fantagraphics
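For checking the facet outside OpenRefine, a minimal sketch of what a text facet computes: each distinct value with its count, sortable by count or by name. The sample values are made up for illustration, not taken from the project data.

```python
from collections import Counter

def text_facet(values):
    """Return (by_count, by_name) listings of distinct values and their counts."""
    counts = Counter(values)
    by_count = counts.most_common()   # most frequent value first
    by_name = sorted(counts.items())  # alphabetical by value
    return by_count, by_name

# Hypothetical publisher values (illustration only).
publishers = [
    "Fantagraphics Books",
    "Fantagraphics Books",
    "Fantagraphics",
    "Turnaround",
]
by_count, by_name = text_facet(publishers)
```

Sorting by name puts near-duplicate spellings next to each other, which is how the leftover variants show up.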
5. Cluster (metaphone3)
    1. replace: Turnaround;
    2. replace: Diamond distributor;
    3. replace: Turnaround distributor;
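A rough sketch of how key-collision clustering groups values. Note this uses a simple fingerprint key (lowercase, strip punctuation, sort unique tokens), not the metaphone3 phonetic key used in the step above, so the clusters it finds will differ. Function names and sample values are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict

def fingerprint(value: str) -> str:
    """Build a normalized key: ASCII, lowercase, no punctuation, sorted unique tokens."""
    value = value.strip().lower()
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s]", "", value)  # drop punctuation
    return " ".join(sorted(set(value.split())))

def cluster(values):
    """Group values that share a fingerprint; keep only groups with more than one member."""
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return [g for g in groups.values() if len(g) > 1]

# Hypothetical values (illustration only).
clusters = cluster([
    "Fantagraphics Books",
    "fantagraphics books.",
    "Books, Fantagraphics",
    "Turnaround",
])
```

Each cluster is a candidate for merging into one spelling; the merge itself is still a manual choice.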
6. replace "xxx by "
    1. regex101.com
    2. export facets (439 choices)
    3. .+ by
    4. replace text in column
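A small sketch for testing the "xxx by " pattern before running the replace. The leading ^ anchor and the trailing space are assumptions added here; the sample values are invented. Because .+ is greedy, everything up to the last " by " is removed.

```python
import re

# Pattern from the notes, anchored and with a trailing space (both are assumptions).
PREFIX_BY = re.compile(r"^.+ by ")

def strip_by_prefix(value: str) -> str:
    """Remove everything up to and including the last ' by ' in the value."""
    return PREFIX_BY.sub("", value)

# Hypothetical publisher values (illustration only).
samples = ["Distributed by Turnaround", "Fantagraphics Books"]
cleaned = [strip_by_prefix(s) for s in samples]
```

Values without " by " pass through unchanged, so the replace is safe to run on the whole column.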
7. cluster again
8. SUBJECTS
    1. to titlecase
    2. split multivalued cells (;)
    3. split multivalued cells (/)
    4. split multivalued cells (--)
    5. titlecase again
    6. delete whitespace
    7. replace ", etc."
    8. replace "."
    9. replace "^-"
    10. replace "^\("
    11. replace "\)$"
9. show as records
10. rejoin with ;
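The SUBJECTS cleanup can be sketched as one function for a single cell. This is an approximation under assumptions: the replacements run before titlecasing (a different order from the steps above), Python's str.title() does not behave exactly like OpenRefine's titlecase (it capitalizes after apostrophes, for instance), and the sample cell is invented.

```python
import re

def clean_subjects(cell: str, sep: str = ";") -> str:
    """Split a multivalued subject cell, clean each value, and rejoin with sep."""
    parts = re.split(r";|/|--", cell)       # the three separators from the notes
    cleaned = []
    for part in parts:
        part = part.replace(", etc.", "")   # drop ", etc."
        part = part.replace(".", "")        # drop remaining periods
        part = part.strip()                 # delete surrounding whitespace
        part = re.sub(r"^-", "", part)      # leading hyphen
        part = re.sub(r"^\(", "", part)     # leading parenthesis
        part = re.sub(r"\)$", "", part)     # trailing parenthesis
        part = part.strip().title()         # approximate titlecase
        if part:                            # skip values that ended up empty
            cleaned.append(part)
    return sep.join(cleaned)

# Hypothetical subject cell (illustration only).
result = clean_subjects(
    "comic books, strips, etc. -- history and criticism; (graphic novels)"
)
```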
11. AUTHOR
    1. split column ", "
    2. transform > titlecase
    3. join columns author 1,2 ", "
    4. Cluster
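A minimal sketch of the AUTHOR split, titlecase, and rejoin steps on one value. It assumes the split produces two columns (hence the single split on ", "), and str.title() is only an approximation of OpenRefine's titlecase. The names are placeholders.

```python
def normalize_author(value: str) -> str:
    """Split on the first ', ', titlecase each piece, and join again with ', '."""
    parts = value.split(", ", 1)  # at most two pieces: author 1, author 2
    return ", ".join(p.strip().title() for p in parts)

# Placeholder names (illustration only).
authors = [normalize_author(a) for a in ["DOE, JANE", "roe, richard"]]
```

Normalizing case first means the cluster step afterwards only has to deal with real spelling differences.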
12. PUBLICATION DATE
    1. ^c|©
    2. ^\[|\].*$|\-.*$
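The two PUBLICATION DATE patterns can be tried out like this, applied in the same order as above. The sample date strings are invented; the function name is hypothetical.

```python
import re

def clean_pub_date(value: str) -> str:
    """Strip copyright markers, brackets, and anything after a hyphen."""
    value = re.sub(r"^c|©", "", value)             # leading "c" or any "©"
    value = re.sub(r"^\[|\].*$|\-.*$", "", value)  # "[", "]...", "-..."
    return value.strip()

# Hypothetical date values (illustration only).
dates = [clean_pub_date(d) for d in ["c2005", "©2011", "[2008]", "2003-2005"]]
```

Order matters: a value such as "[c2005]" keeps its "c", because ^c only matches at the very start and the bracket is still there when the first pattern runs.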
regex101.com
\b(author|illustrator|artist)\b
\D*
second comma:
^([^,]*,[^,]*),.
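A quick way to try the loose patterns above, as a sketch. The second-comma pattern is used as written, taking capture group 1 as the part to keep; that use of the group is an assumption about the intent. Sample values and function names are placeholders.

```python
import re

ROLE = re.compile(r"\b(author|illustrator|artist)\b")
UP_TO_SECOND_COMMA = re.compile(r"^([^,]*,[^,]*),.")

def before_second_comma(value: str) -> str:
    """Keep the text before the second comma; return the value unchanged if there is none."""
    m = UP_TO_SECOND_COMMA.match(value)
    return m.group(1) if m else value

def digits_only(value: str) -> str:
    """Remove every run of non-digit characters."""
    return re.sub(r"\D*", "", value)

# Placeholder values (illustration only).
name = before_second_comma("Doe, Jane, author")
role = ROLE.search("Roe, Richard, illustrator").group(1)
year = digits_only("c2005.")
```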