Language support #4

lubitchv · 2024-10-24T16:19:51Z

Language support was added. The code checks the xml:lang (Metadatalanguage) property in the XML codebook and, if it is present, selects the language for the PDF codebook based on it. If the inc language file for that language does not exist, it produces an English PDF codebook.
If the xml:lang property does not exist, the code tries to guess the language of the dataset based on the Title or Description. It uses Apache Tika language detection. If it detects a language with high probability, that language is used for generating the PDF codebook; in all other cases English is used.
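
For readers skimming the thread, the fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `CodebookLanguageChooser`, the method `choose`, and the `confidentDetector` parameter are all hypothetical. The detector stands in for Apache Tika and is assumed to return a language code only when detection is reasonably certain.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper; names are illustrative and not taken from the PR.
class CodebookLanguageChooser {
    static final String DEFAULT_LANG = "en";

    /**
     * Picks the PDF codebook language.
     * 1. Use xml:lang when it is present.
     * 2. Otherwise ask the detector, which is assumed to return a value
     *    only for high-probability guesses (a stand-in for Apache Tika).
     * 3. Fall back to English when there is no candidate or no language
     *    file for the candidate.
     */
    static String choose(String xmlLang,
                         String titleOrDescription,
                         Set<String> availableLangs,
                         Function<String, Optional<String>> confidentDetector) {
        String candidate = null;
        if (xmlLang != null && !xmlLang.trim().isEmpty()) {
            candidate = xmlLang.trim().toLowerCase(Locale.ROOT);
        } else if (titleOrDescription != null && !titleOrDescription.trim().isEmpty()) {
            candidate = confidentDetector.apply(titleOrDescription).orElse(null);
        }
        if (candidate != null && availableLangs.contains(candidate)) {
            return candidate;
        }
        return DEFAULT_LANG;
    }
}
```

Passing the detector in as a function keeps the sketch free of the Tika dependency; in the exporter itself the guess comes from Tika, as described above.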

qqmyers

Looks good. I had a couple questions and an offline request to bump the version number.

qqmyers · 2024-10-24T16:43:46Z

src/main/java/io/gdcc/export/ddipdf/FileResolver.java

-        String url =href.substring("file:".length()); // some calculation from its parameters
-                InputStream is = this.getClass().getResourceAsStream(url);
-                return new StreamSource(is);
+            int index = href.lastIndexOf("/");


Why remove the path? I guess it works now because the Java package path and the resources path are the same?

Yes, and it also did not work with the path. In any case, "ddi-to-fo.xsl" does not use a path either.

Weird - I thought I had the i18n part (without the tika test) working at one point.
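
To make the point of this thread concrete, here is a rough sketch of a resolver that drops the path from `href` and loads the bare file name relative to its own package. Only the `lastIndexOf("/")` line appears in the diff above; the rest, including the class name `PackageRelativeResolver` and the helper `fileName`, is an assumption for illustration and may differ from the actual FileResolver.

```java
import java.io.InputStream;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical resolver; not the FileResolver from this PR.
class PackageRelativeResolver implements URIResolver {

    // Keeps only the part of href after the last "/", if there is one.
    static String fileName(String href) {
        int index = href.lastIndexOf("/");
        return index >= 0 ? href.substring(index + 1) : href;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        String name = fileName(href);
        // A name without a leading "/" is looked up relative to this class's
        // package, which is why the package path and resources path must match.
        InputStream is = getClass().getResourceAsStream(name);
        if (is == null) {
            throw new TransformerException("Resource not found on classpath: " + name);
        }
        return new StreamSource(is);
    }
}
```

Failing loudly when the stream is null is a design choice of this sketch; a `StreamSource` wrapping a null stream tends to surface later as a less obvious transformer error.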

qqmyers · 2024-10-24T16:45:59Z

pom.xml

+            <groupId>org.apache.tika</groupId>
+            <artifactId>tika-langdetect-optimaize</artifactId>
+            <version>2.9.2</version>
+        </dependency>


Have you looked at issue #3 - is fop still not included? Are the tika classes? (If so - then we can close that issue when this merges).

fop is included. I did not have problems with it.

Got it - guessing it's a build command problem - I made a note in issue #3.
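
One quick way to answer "is this class included?" for a given build is to probe the classpath at runtime. The sketch below is a hypothetical helper (`ClasspathCheck.isPresent` is not part of this project); `org.apache.fop.apps.FopFactory` is the class named in issue #3, and any Tika class name passed to it would be the caller's assumption about what the tika-langdetect-optimaize artifact provides.

```java
// Hypothetical diagnostic helper; not part of the exporter.
class ClasspathCheck {

    // Returns true when the named class can be loaded, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is handled here too.
            return false;
        }
    }
}
```

For example, calling `ClasspathCheck.isPresent("org.apache.fop.apps.FopFactory")` from code running inside the packaged jar would show whether a particular build command bundled fop.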

lubitchv · 2024-10-24T17:31:14Z

I changed the version to 1.1.0.

pdurbin · 2024-10-24T17:59:46Z

Can we publish it to Maven Central? And update the TBD at https://github.com/gdcc/dataverse-exporters ?

qqmyers · 2024-10-24T21:41:24Z

I was able to test this in English and it worked fine, so I'll go ahead and merge.

Language support

ac8aaf9

qqmyers requested changes Oct 24, 2024

version

ec6374d

qqmyers mentioned this pull request Oct 24, 2024

Export fails with NoClassDefFoundError: org/apache/fop/apps/FopFactory #3

Open

qqmyers merged commit 48cf697 into gdcc:main Oct 24, 2024
1 check passed
