UPDATE
davidycliao committed Dec 30, 2024
1 parent 9b6d0b0 commit a3b2431
Showing 23 changed files with 148 additions and 1,251 deletions.
154 changes: 80 additions & 74 deletions .github/workflows/docker-publish.yml
@@ -1,91 +1,97 @@
name: flaiR-Docker

on:
push:
branches: [main, master]
pull_request:
branches: [main, master]
push:
branches: [main, master]
pull_request:
branches: [main, master]

jobs:
R-CMD-check:
runs-on: ${{ matrix.config.os }}
name: ${{ matrix.config.os }} (${{ matrix.config.r }})
strategy:
fail-fast: false
matrix:
config:
- {os: macos-latest, r: 'release'}
- {os: windows-latest, r: 'release'}
- {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'}
- {os: ubuntu-latest, r: 'release'}
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
R_KEEP_PKG_SOURCE: yes
steps:
- uses: actions/checkout@v3
R-CMD-check:
runs-on: ${{ matrix.config.os }}
name: ${{ matrix.config.os }} (${{ matrix.config.r }})
strategy:
fail-fast: false
matrix:
config:
- {os: macos-latest, r: 'release'}
- {os: windows-latest, r: 'release'}
- {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'}
- {os: ubuntu-latest, r: 'release'}
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
R_KEEP_PKG_SOURCE: yes
steps:
- uses: actions/checkout@v3

- uses: r-lib/actions/setup-r@v2
with:
r-version: ${{ matrix.config.r }}
http-user-agent: ${{ matrix.config.http-user-agent }}
use-public-rspm: true
- uses: r-lib/actions/setup-r@v2
with:
r-version: ${{ matrix.config.r }}
http-user-agent: ${{ matrix.config.http-user-agent }}
use-public-rspm: true

- uses: r-lib/actions/setup-pandoc@v2
- uses: r-lib/actions/setup-pandoc@v2

- name: Setup Python
uses: actions/setup-python@v2
with:
python-version: '3.9'
- name: Setup Python
uses: actions/setup-python@v2
with:
python-version: '3.9'

- name: Check Python Version
run: python --version
- name: Check Python Version
run: python --version

- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
pip install flair
- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
pip install flair
- name: Install R dependencies
run: |
install.packages('remotes')
remotes::install_github("davidycliao/flaiR", force = TRUE)
shell: Rscript {0}
- name: Install R dependencies
run: |
install.packages('remotes')
remotes::install_github("davidycliao/flaiR", force = TRUE)
shell: Rscript {0}

- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: rcmdcheck
- uses: r-lib/actions/setup-r-dependencies@v2
with:
extra-packages: rcmdcheck

docker:
needs: R-CMD-check
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
docker:
needs: R-CMD-check
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- uses: actions/checkout@v3

steps:
- uses: actions/checkout@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v3

- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2

- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Log in to GitHub Container Registry
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Log in to GitHub Container Registry
uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Build and push Docker image
uses: docker/build-push-action@v4
with:
context: .
platforms: linux/amd64,linux/arm64
push: true
tags: |
ghcr.io/${{ github.repository_owner }}/flair:latest
ghcr.io/${{ github.repository_owner }}/flair:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Build and push Docker image
uses: docker/build-push-action@v4
with:
context: .
platforms: linux/amd64,linux/arm64
push: true
tags: |
ghcr.io/${{ github.repository_owner }}/flair:latest
ghcr.io/${{ github.repository_owner }}/flair:${{ github.sha }}
labels: |
org.opencontainers.image.title=flaiR
org.opencontainers.image.description=An R Docker Image for Natural Language Processing with Flair. Includes R, Python with Flair NLP library, and text analysis dependencies.
org.opencontainers.image.vendor=davidycliao
org.opencontainers.image.version=0.0.7
annotations: |
org.opencontainers.image.description=An R Docker Image for Natural Language Processing with Flair. This image includes R, Python with Flair NLP library, and essential dependencies for text analysis. Supports both standard NER and OntoNotes models, with batch processing capabilities.
cache-from: type=gha
cache-to: type=gha,mode=max
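
A minimal sketch of how the two `tags` templates in the build step expand, assuming the repository owner is `davidycliao` (taken from the repository URL in DESCRIPTION). The helper name `image_tags` and the placeholder SHA are hypothetical; the workflow itself relies on GitHub's own expression expansion, not on code like this.

```python
from typing import List


def image_tags(owner: str, sha: str, image: str = "flair") -> List[str]:
    """Expand the workflow's two tag templates for a given owner and commit SHA.

    Hypothetical helper for illustration only; it mirrors
    ghcr.io/${{ github.repository_owner }}/flair:latest and
    ghcr.io/${{ github.repository_owner }}/flair:${{ github.sha }}.
    """
    registry = "ghcr.io"
    base = f"{registry}/{owner}/{image}"
    return [f"{base}:latest", f"{base}:{sha}"]


if __name__ == "__main__":
    # "<commit-sha>" stands in for the full commit SHA, which is not shown here.
    for tag in image_tags("davidycliao", "<commit-sha>"):
        print(tag)
```

Every push therefore publishes a moving `latest` tag plus one immutable tag per commit, so a consumer can pin to a specific build.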
40 changes: 24 additions & 16 deletions DESCRIPTION
@@ -12,7 +12,7 @@ Authors@R: c(
Description: Provides a suite of intuitive and flexible wrapper tools tailored
for accessing FLAIR in Python data analysis. The package is an R wrapper
designed to provide access to the primary features of FLAIR.
Maintainer: David Liao <[email protected]>
Maintainer: Yen-Chieh Liao <[email protected]>
SystemRequirements: Python (>= 3.8.0) flair (>= 0.12)
Depends:
R (>= 3.6)
@@ -22,22 +22,30 @@ LazyData: true
URL: https://davidycliao.github.io/flaiR
BugReports: https://github.com/davidycliao/flaiR/issues
Imports:
data.table,
reticulate,
curl,
attempt,
htmltools,
stringr
data.table,
reticulate,
curl,
attempt,
htmltools,
stringr,
stats,
utils,
tibble
Suggests:
knitr,
renv,
rmarkdown,
lsa,
purrr,
jsonlite,
ggplot2,
plotly,
testthat (>= 3.0.0)
knitr,
renv,
rmarkdown,
lsa,
purrr,
jsonlite,
ggplot2,
plotly,
testthat (>= 3.0.0),
conText,
dplyr,
ggpubr,
quanteda,
text2vec
RoxygenNote: 7.3.2
VignetteBuilder: knitr, rmarkdown
Roxygen: list(markdown = TRUE)
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,8 +1,8 @@
FROM r-base:latest
LABEL maintainer="Yen-Chieh Liao <[email protected]>"
LABEL org.opencontainers.image.description="flaiR: An R Docker Image for Natural Language Processing with Flair. This image includes R, Python with Flair NLP library, and essential dependencies for text analysis. Supports both standard NER and OntoNotes models, with batch processing capabilities."


# Install system dependencies
RUN apt-get update && apt-get install -y \
python3 \
python3-pip \
2 changes: 1 addition & 1 deletion R/flair.R
@@ -57,7 +57,7 @@
#'}
#' @return An object that represents the Flair module from Python.
#'
#' @details This function relies on the `reticulate` package to import and
#' @details This function relies on the reticulate package to import and
#' use the Flair module from Python. Ensure you have the Flair Python library
#' installed in the Python environment being used.
#'
2 changes: 1 addition & 1 deletion R/flair_datasets.R
@@ -1,6 +1,6 @@
#' @title Access the flair_datasets Module from Flair
#'
#' @description Utilizes the {reticulate} package to import the `flair.datasets`
#' @description Utilizes the reticulate package to import the `flair.datasets`
#' dataset from Flair's datasets in Python, enabling the use of this dataset in
#' an R environment.
#'
6 changes: 3 additions & 3 deletions R/flair_embeddings.R
@@ -180,7 +180,7 @@ flair_embeddings.FlairEmbeddings <- function(embeddings_type = "news-forward") {
#' @title Initializing a Class for Flair WordEmbeddings Class
#'
#' @description
#' This function interfaces with Python via {reticulate} to create a `WordEmbeddings`
#' This function interfaces with Python via reticulate to create a `WordEmbeddings`
#' object using the Flair library. Users select which pre-trained embeddings to load
#' by providing the appropriate ID string. Typically, a two-letter language code initializes
#' an embedding (e.g., 'en' for English, 'de' for German). By default, this loads FastText embeddings
@@ -258,7 +258,7 @@ flair_embeddings.WordEmbeddings <- function(embeddings = "glove") {

#' @title Initializing a Class for TransformerDocumentEmbeddings
#'
#' @description This function interfaces with Python via {reticulate} to
#' @description This function interfaces with Python via reticulate to
#' create a `flair_embeddings.TransformerDocumentEmbeddings` object from
#' the flair.embeddings module.
#'
@@ -342,7 +342,7 @@ flair_embeddings.TransformerDocumentEmbeddings <- function(model = "bert-base-un

#' @title Initializing a Class for TransformerWordEmbeddings
#'
#' @description This function interfaces with Python via {reticulate} to create
#' @description This function interfaces with Python via reticulate to create
#' a `TransformerWordEmbeddings` object from the flair.embeddings module.
#'
#' @param model A character string specifying the pre-trained model to use.
51 changes: 0 additions & 51 deletions R/flair_loaders.R
@@ -160,57 +160,6 @@ get_tagger_tags <- function(tagger) {
}


#' @title Load Flair POS Tagger
#'
#' @description This function loads the POS (part-of-speech) tagger model for a specified language
#' using the Flair library. If no language is specified, it defaults to 'pos-fast'.
#'
#' @param language A character string indicating the desired language model. If `NULL`,
#' the function will default to the 'pos-fast' model. Supported language models include:
#' \itemize{
#' \item "pos" - General POS tagging
#' \item "pos-fast" - Faster POS tagging
#' \item "upos" - Universal POS tagging
#' \item "upos-fast" - Faster Universal POS tagging
#' \item "pos-multi" - Multi-language POS tagging
#' \item "pos-multi-fast" - Faster Multi-language POS tagging
#' \item "ar-pos" - Arabic POS tagging
#' \item "de-pos" - German POS tagging
#' \item "de-pos-tweets" - German POS tagging for tweets
#' \item "da-pos" - Danish POS tagging
#' \item "ml-pos" - Malayalam POS tagging
#' \item "ml-upos" - Malayalam Universal POS tagging
#' \item "pt-pos-clinical" - Clinical Portuguese POS tagging
#' \item "pos-ukrainian" - Ukrainian POS tagging
#' }
#' @return A Flair POS tagger model corresponding to the specified (or default) language.
#'
#' @importFrom reticulate import
#' @export
#' @examples
#' \dontrun{
#' tagger <- load_tagger_pos("pos-fast")
#' }
# load_tagger_pos <- function(language = NULL) {
# supported_lan_models <- c("pos", "pos-fast", "upos", "upos-fast",
# "pos-multi", "pos-multi-fast", "ar-pos", "de-pos",
# "de-pos-tweets", "da-pos", "ml-pos",
# "ml-upos", "pt-pos-clinical", "pos-ukrainian")
#
# if (is.null(language)) {
# language <- "pos-fast"
# message("Language is not specified. ", language, "in Flair is forceloaded. Please ensure that the internet connectivity is stable. \n")
# }
#
# # Ensure the model is supported
# check_language_supported(language = language, supported_lan_models = supported_lan_models)
#
# # Load the model
# flair <- reticulate::import("flair")
# Classifier <- flair$nn$Classifier
# tagger <- Classifier$load(language)
# }

#' @title Load POS (Part-of-Speech) Tagger Model
#'
#' @description Loads a Part-of-Speech tagging model from Flair and displays
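
The commented-out `load_tagger_pos` helper removed in this hunk falls back to `"pos-fast"` when no model is given and checks the name against a fixed list before loading. A minimal Python sketch of that default-and-validate step follows. The function name `resolve_pos_model` is hypothetical, and raising `ValueError` for an unknown name is an assumption, since the behavior of `check_language_supported` is not shown in this diff. The sketch does not import or call Flair.

```python
from typing import Optional

# Model names copied from the removed helper's roxygen documentation.
SUPPORTED_POS_MODELS = (
    "pos", "pos-fast", "upos", "upos-fast",
    "pos-multi", "pos-multi-fast", "ar-pos", "de-pos",
    "de-pos-tweets", "da-pos", "ml-pos",
    "ml-upos", "pt-pos-clinical", "pos-ukrainian",
)

DEFAULT_POS_MODEL = "pos-fast"


def resolve_pos_model(language: Optional[str] = None) -> str:
    """Return the model name to load, using the default when none is given.

    Hypothetical helper: it only mirrors the selection logic of the removed
    R code and never downloads or loads a model.
    """
    if language is None:
        return DEFAULT_POS_MODEL
    if language not in SUPPORTED_POS_MODELS:
        # Assumption: unknown names are rejected with an error.
        raise ValueError(f"Unsupported POS model: {language!r}")
    return language


if __name__ == "__main__":
    print(resolve_pos_model())          # default model name
    print(resolve_pos_model("de-pos"))  # explicitly requested model name
```

Keeping the name resolution separate from the model download makes the fallback easy to check without network access.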
2 changes: 1 addition & 1 deletion R/flair_models.R
@@ -83,7 +83,7 @@ flair_models.TextClassifier <- function() {
#' @title Access Flair's SequenceTagger
#'
#' @description
#' This function utilizes the {reticulate} package to import the `SequenceTagger`s
#' This function utilizes the reticulate package to import the `SequenceTagger`s
#' from Flair's models in Python, enabling interaction with Flair's sequence
#' tagging models in an R environment.
#'
2 changes: 1 addition & 1 deletion R/flair_trainers.R
@@ -1,7 +1,7 @@
#' @title Import flair.trainers Module in R
#'
#' @description This flair_trainers() provides R users with access to Flair's
#' ModelTrainer Python class using the {reticulate} package. The `ModelTrainer`
#' ModelTrainer Python class using the reticulate package. The `ModelTrainer`
#' class offers the following main methods:
#' \itemize{
#' \item **train**: Trains a given model. Parameters include the corpus