Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

package_update deletes resource, some unspecified tags #108

Open
t-morrison opened this issue Feb 15, 2019 · 18 comments · May be fixed by #181
Open

package_update deletes resource, some unspecified tags #108

t-morrison opened this issue Feb 15, 2019 · 18 comments · May be fixed by #181

Comments

@t-morrison
Copy link

Running package_update() deletes the resource associated with the package on my CKAN instance, along with some (not all) of the metadata tags for the package that I do not specify.

upd <- list(
    owner_org = x$owner_org,
    name = x$name,
    title = x$title,
    country = x$country,
    document_type = "A new thing",
    notes = "This is a new note added from code again"
  )

  r <- package_update(
    upd,
    x$id
  )
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] readxl_1.1.0         jsonlite_1.5         httr_1.3.1           ckanr_0.1.2.9924     DBI_1.0.0            DT_0.4               shinydashboard_0.7.1 shinyjs_1.0         
 [9] googlesheets_0.3.0   shiny_1.2.0          bindrcpp_0.2.2       dplyr_0.7.8          nrgiR_0.1.0          data.table_1.10.4-3 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       dbplyr_1.3.0     pillar_1.2.2     compiler_3.5.0   cellranger_1.1.0 later_0.7.5      bindr_0.1.1      tools_3.5.0      digest_0.6.15   
[10] tibble_1.4.2     pkgconfig_2.0.1  rlang_0.3.0.1    cli_1.0.0        rstudioapi_0.8   crosstalk_1.0.0  curl_3.2         yaml_2.1.18      stringr_1.3.0   
[19] xml2_1.2.0       htmlwidgets_1.2  tidyselect_0.2.4 glue_1.2.0       R6_2.2.2         purrr_0.2.5      magrittr_1.5     promises_1.0.1   htmltools_0.3.6 
[28] assertthat_0.2.0 mime_0.5         xtable_1.8-2     httpuv_1.4.5.1   utf8_1.1.3       stringi_1.1.7    openssl_1.0.1    crayon_1.3.4    
@florianm
Copy link
Contributor

Hi @moman822, the (best) way to update a package is to first get the package data, then update fields as required, then update the package with the updated original data.

Hope this helps!

@t-morrison
Copy link
Author

@florianm can you show (or point to) a brief example of what you mean? Do you mean somehow updating the package data locally first? Thanks.

@florianm
Copy link
Contributor

florianm commented Feb 20, 2019

@moman822 here's what I'd do:

# Step 1: get the dataset details as R list
ds_id <- "my-dataset-id-md5-hash"
ds <- ckanr::package_show(ds_id)

# Step 2: update selected fields
ds$title <- "An updated title"
ds$description <- "I have updated title and description, while leaving all other fields unchanged."
# ds contains all other package data, including tags and resources

# Step 3: Update the dataset on CKAN with the locally modified list
result <- ckanr::package_update(ds, ds_id)

Does this work for you?

@sckott
Copy link
Contributor

sckott commented Apr 3, 2019

closing due to inactivity

@sckott sckott closed this as completed Apr 3, 2019
@fernandobarbalho
Copy link

fernandobarbalho commented Apr 13, 2019

Hi, I had this same issue, and I think that is actually a bug that must be fixed. As you can read in the documentation pdf file, page 42, the approach is to include in the list just the fields you want to update. Please see this text from the documentation (https://cran.r-project.org/web/packages/ckanr/ckanr.pdf):

## update just chosen things
# Make some changes
x <- list(maintainer_email = "[email protected]")
# Then update the packge
package_update(x, '585d7ea2-ded0-4fed-9b08-61f7e83a3cb2')

@sckott
Copy link
Contributor

sckott commented Apr 15, 2019

@fernandobarbalho What is a bug? you said "that", but what is "that"?

@fernandobarbalho
Copy link

Sorry, I wasn´t so clear.
The bug is that the package updated is losing informations related to the fields that are not explicited in the list. As I´ve highlighted above, according to the documentation the user should be able to list in the x argument just the chosen things he/she wants to update.

@florianm
Copy link
Contributor

florianm commented Apr 15, 2019

If I may offer my opinion, the "delete what's not included" behaviour feels logical. Updating the package with a dict implies that the dict contains the entire package metadata. The dict returned by package_show contains (for free and in one command) all package metadata and is easier to manipulate than starting from scratch with an empty dict and having to get all the structure correct (I'm not good at this, especially when under-caffeinated, so package_show saved my bacon more than once).

As an approach to prevent unintended data loss, maybe the package_list help could expand more on this topic?
e.g. something along the lines of:

package_update replaces all package metadata with the entire dict, and only the dict. Any data present in the package but not in the dict will be dropped.

To add package metadata and retain the existing package metadata:

  • get the current package metadata with package_show
  • modify and/or add to the package metadata
  • update the package with the package metadata

To replace package metadata with new data and drop any other existing package metadata:

  • create a new package metadata dict, or get the current metadata and remove items from it
  • update the package metadata

Examples:

## Update existing package metadata
# Step 1: get the dataset details as R list
ds_id <- "my-dataset-id-md5-hash"
ds <- ckanr::package_show(ds_id)

# Step 2: update selected fields
ds$title <- "An updated title"
ds$description <- "I have updated title and description, while leaving all other fields unchanged."
# ds contains all other package data, including tags and resources

# Step 3: Update the dataset on CKAN with the locally modified list
result <- ckanr::package_update(ds, ds_id)

## Replace existing package metadata
# Step 4: illustrate possible or intended data loss - what's not in the dict will be deleted in the package:
del(ds$description)
result_with_deleted_description <- ckanr::package_update(ds, ds_id)

@fernandobarbalho @moman822 if the help for package_update were to include the above clarification and example, would you feel informed enough?

@fernandobarbalho
Copy link

fernandobarbalho commented Apr 15, 2019 via email

@florianm
Copy link
Contributor

@sckott should I PR an update to the docs for package_update?

@fernandobarbalho
Copy link

@florianm , I am sorry but the code you suggested did not work to me.
The x argument of the function package_update() must be of class list, as is written in the code:

  if (class(x) != "list") {
    stop("x must be of class list", call. = FALSE)
  }

when I execute this script bellow...

ds_ckan<-ckanr::package_show("343739de-7560-47af-8a0e-d51b363ec273",url="https://apickan.tesouro.gov.br/ckan", as= "list")

ds_ckan$license_title <- "Licença Aberta para Bases de Dados (ODbL) do Open Data Commons"
ds_ckan$license_id<- "odc-odbl"
ds_ckan$license_url <- "http://www.opendefinition.org/licenses/odc-by"
 
class(ds_ckan)

... the last executed line gives me the following result

> class(ds_ckan)
[1] "ckan_package"

which implies that the condition to execute package_update is not correct, as it can be seen when I execute this following script:

ckanr::package_update( x= ds_ckan,
                       id= "343739de-7560-47af-8a0e-d51b363ec273",
                      url ="https://apickan.tesouro.gov.br/ckan",
                      key = "my_key")
Error: x must be of class list
> sessionInfo()

R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252   
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C                      
[5] LC_TIME=Portuguese_Brazil.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ckanr_0.1.2.9924 DBI_1.0.0       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0        pillar_1.3.1      compiler_3.4.3    dbplyr_1.3.0      bindr_0.1.1      
 [6] prettyunits_1.0.2 remotes_2.0.2     tools_3.4.3       digest_0.6.18     pkgbuild_1.0.2   
[11] pkgload_1.0.2     tibble_2.0.1      jsonlite_1.6      memoise_1.1.0     pkgconfig_2.0.2  
[16] rlang_0.3.1       cli_1.0.1         rstudioapi_0.9.0  curl_3.3          yaml_2.2.0       
[21] bindrcpp_0.2.2    withr_2.1.2       httr_1.4.0        dplyr_0.7.8       desc_1.2.0       
[26] fs_1.2.6          devtools_2.0.1    tidyselect_0.2.5  rprojroot_1.3-2   glue_1.3.0       
[31] R6_2.3.0          processx_3.2.1    sessioninfo_1.1.1 purrr_0.3.0       callr_3.1.1      
[36] magrittr_1.5      backports_1.1.3   ps_1.3.0          usethis_1.4.0     assertthat_0.2.0 
[41] crayon_1.3.4     

@fernandobarbalho
Copy link

@florianm , good news
ds_ckan as = 'table' works nice

so, I suggest to put this in your documentation


ds_ckan<-ckanr::package_show(pack_id, url, as= "table")

@florianm
Copy link
Contributor

thanks heaps @fernandobarbalho!

@sckott
Copy link
Contributor

sckott commented Apr 17, 2019

should I PR an update to the docs for package_update

@florianm yes, please do

@florianm
Copy link
Contributor

Still in my "later" basket, sorry haven't had bandwidth yet.

@sckott
Copy link
Contributor

sckott commented Oct 14, 2020

no worries

@sharlagelfand
Copy link
Contributor

This has come up again in the context of updating resources, where everything that isn't specified is deleted. In that function, the documentation explicitly states Other metadata, such as name or description, are not updated. - definitely not the case!

@florianm, do you expect you'd have bandwidth for updating the docs on this? No worries if not, I can take a go on it. Re-opening this issue to keep my eyes on it 👀

@sharlagelfand sharlagelfand reopened this Oct 20, 2021
@florianm
Copy link
Contributor

florianm commented Oct 21, 2021

Thanks for the reminder, this has been in my "later" basket for too long!

Edit: PR sent, apologies haven't been able to run tests locally (no test server creds).

@sharlagelfand only if that's appropriate - could you DM me test server credentials on the rOpenSci slack?
Could you test-drive the examples of package_update and resource_update?
Happy to amend PR until correct and satisfactory.

florianm pushed a commit to dbca-wa/ckanr that referenced this issue Oct 21, 2021
* package_update: Update docs and examples
* resource_update: Update docs
* DESC: Bump dev version
* DESC: Bump RoxygenNote patch version
* NEWS: Add mention
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants