Command-Line Interface¶
To see all commands available, run:
oc4idskit --help
Users on Windows should run set PYTHONIOENCODING=utf-8 and set PYTHONUTF8=1 in each terminal session before running any oc4idskit commands. To set these environment variables for all future sessions, run setx PYTHONIOENCODING utf-8 and setx PYTHONUTF8 1.
Inputs¶
To process a remote file:
curl <url> | oc4idskit <command>
To process a local file:
cat <path> | oc4idskit <command>
The inputs can be concatenated JSON or JSON arrays.
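As an illustration of what "concatenated JSON or JSON arrays" means, the following is a minimal sketch that reads both shapes using only the Python standard library. The function name `iter_json_items` is hypothetical, and this is not necessarily how oc4idskit parses its input.

```python
import json


def iter_json_items(text):
    """Yield items from concatenated JSON values, flattening top-level arrays."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos == length:
            break
        value, pos = decoder.raw_decode(text, pos)
        if isinstance(value, list):
            # A JSON array input: each element is one item.
            yield from value
        else:
            # Concatenated JSON input: each value is one item.
            yield value
```

Both `{"id": 1}\n{"id": 2}` and `[{"id": 1}, {"id": 2}]` yield the same two items with this approach.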
Options¶
Optional arguments for all commands are:
- --encoding ENCODING
the file encoding
- --ascii
print escape sequences instead of UTF-8 characters
- --pretty
pretty print output
- --root-path ROOT_PATH
the path to the items to process within each input
See the guidance for handling edge cases in OCDS. You can use the same approaches with OC4IDS data.
split-project-packages¶
Reads project packages from standard input, and prints smaller project packages for each.
Mandatory positional arguments:
- size
the number of projects per package
cat tests/fixtures/oc4ids/project_package.json | oc4idskit split-project-packages 1 | split -l 1 -a 4
The split command will write files named xaaaa, xaaab, xaaac, etc. Because split -l 1 treats each line as one package, don’t combine the OC4IDS Kit --pretty option with the split-project-packages command.
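The splitting itself can be pictured with a short sketch. It assumes a project package is a dictionary with a `projects` list and that package metadata is copied unchanged into each smaller package; `split_project_package` is a hypothetical name, not the kit's API.

```python
def split_project_package(package, size):
    """Yield smaller packages with at most `size` projects each."""
    projects = package.get("projects", [])
    for start in range(0, len(projects), size):
        smaller = dict(package)  # shallow copy keeps the package metadata
        smaller["projects"] = projects[start:start + size]
        yield smaller
```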
combine-project-packages¶
Reads project packages from standard input, collects projects, and prints one project package.
If the --publisher-* options aren’t used, the output package will have the same publisher as the last input package.
Optional arguments:
- --uri URL
set the project package’s uri to this value
- --published-date PUBLISHED_DATE
set the project package’s publishedDate to this value
- --version VERSION
set the project package’s version to this value
- --publisher-name PUBLISHER_NAME
set the project package’s publisher’s name to this value
- --publisher-uri PUBLISHER_URI
set the project package’s publisher’s uri to this value
- --publisher-scheme PUBLISHER_SCHEME
set the project package’s publisher’s scheme to this value
- --publisher-uid PUBLISHER_UID
set the project package’s publisher’s uid to this value
- --fake
set the project package’s required metadata to dummy values
cat tests/fixtures/project_package_split.json | oc4idskit combine-project-packages > out.json
If you need to create a single package that is too large to hold in your system’s memory, please comment on this issue.
For the Python API, see oc4idskit.combine.combine_project_packages().
Note
A warning is issued if a package’s "projects" field isn’t set.
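The behaviour described in this section can be sketched roughly as follows. This is an illustrative outline, not the implementation of oc4idskit.combine.combine_project_packages(): the name `combine_sketch` is hypothetical, and keeping the most recently seen publisher when the last package has none is an assumption.

```python
import warnings


def combine_sketch(packages):
    """Collect projects from all packages into one output package."""
    output = {"projects": []}
    for package in packages:
        # Later packages overwrite the publisher, so the last one wins.
        if "publisher" in package:
            output["publisher"] = package["publisher"]
        if "projects" in package:
            output["projects"].extend(package["projects"])
        else:
            warnings.warn('package has no "projects" field')
    return output
```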
convert-from-ocds¶
Reads individual releases or release packages from standard input, and prints a single project conforming to the Open Contracting for Infrastructure Data Standards (OC4IDS). It assumes all inputs belong to the same project.
You can refer to the documentation of the mapping between OCDS and OC4IDS.
Optional arguments:
- --project-id PROJECT_ID
set the project’s id to this value
- --all-transforms
run all optional transforms
- --transforms OPTIONS
comma-separated list of optional transforms to run
- --package
wrap the project in a project package
- --uri URI
if --package is set, set the project package’s uri to this value
- --published-date PUBLISHED_DATE
if --package is set, set the project package’s publishedDate to this value
- --version VERSION
if --package is set, set the project package’s version to this value
- --publisher-name PUBLISHER_NAME
if --package is set, set the project package’s publisher’s name to this value
- --publisher-uri PUBLISHER_URI
if --package is set, set the project package’s publisher’s uri to this value
- --publisher-scheme PUBLISHER_SCHEME
if --package is set, set the project package’s publisher’s scheme to this value
- --publisher-uid PUBLISHER_UID
if --package is set, set the project package’s publisher’s uid to this value
- --fake
if --package is set, set the project package’s required metadata to dummy values
cat releases.json | oc4idskit convert-from-ocds > out.json
Transforms¶
The transforms that are run are described here.
- additional_classifications, description, sector, title: populate top-level fields with their equivalents from planning.project
- administrative_entity, public_authority_role, procuring_entity, suppliers: populate the parties field according to the party role
- budget: populates budget.amount with its equivalent
- budget_approval, environmental_impact, land_and_settlement_impact and project_scope: populate the documents field from planning.documents according to the documentType
- contracting_process_setup: Sets up the contractingProcesses array of objects with id, summary, releases and embeddedReleases. Some of the other transforms depend on this, so it is run first
- contract_period: populates the summary.contractPeriod field with appropriate values from awards or tender
- contract_price: populates the summary.contractValue field with the sum of all awards.value fields where the currency is the same
- cost_estimate: populates the summary.tender.costEstimate field with the appropriate tender.value
- contract_process_description: populates the summary.description field from appropriate values in contracts, awards or tender
- contract_status: populates the summary.status field using the contractingProcessStatus codelist.
- contract_title: populates summary.title from the title field in awards, contracts or tender
- final_audit: populate the documents field from contracts.implementation.documents according to the documentType
- funding_sources: updates parties with organizations having funder in their roles or from planning.budgetBreakdown.sourceParty
- location: populates the locations field with an array of location objects from planning.projects.locations
- procurement_process: populates the .summary.tender.procurementMethod and .summary.tender.procurementMethodDetails fields with their equivalents from tender
- purpose: populates the purpose field from planning.rationale
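To make one of these transforms concrete, here is a minimal sketch of the idea behind contract_price. It assumes each award's value is an object with `amount` and `currency`, and it assumes that no value is produced when the awards use different currencies; the source only says the amounts are summed "where the currency is the same", so the mixed-currency behaviour shown here is a guess. `contract_value` is a hypothetical name.

```python
def contract_value(awards):
    """Sum award amounts when every award value shares one currency."""
    values = [
        award["value"]
        for award in awards
        if "value" in award and "amount" in award["value"]
    ]
    currencies = {value.get("currency") for value in values}
    # Assumption: with no values, or with mixed currencies, return nothing.
    if not values or len(currencies) != 1:
        return None
    return {
        "amount": sum(value["amount"] for value in values),
        "currency": currencies.pop(),
    }
```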
Optional transforms¶
Some transforms are not run automatically, only if requested. The following transforms are run if they are listed in the --transforms argument (as part of a comma-separated list) or if --all-transforms is passed.
- buyer_role: updates the parties field with parties that have buyer in their roles
- description_tender: populate the description field from tender.description if no other is available
- location_from_items: populate the locations field from deliveryLocation or deliveryAddress in tender.items if no other is available
- project_scope_summary: updates summary.tender with items and milestones from tender
- purpose_needs_assessment: populate the documents field from planning.documents according to the documentType needsAssessment
- title_from_tender: populate the title field from tender.title if no other is available
Transformation Notes¶
Most transforms follow the logic in the mapping documentation. However, there is some room for interpretation in some of the mappings, so here are some notes about these interpretations.
Differing text across multiple contracting processes¶
planning/project/title, planning/project/description (planning and budget extension):
If the contracting processes contradict each other (for example, one gives a different title from another), a warning is raised and the field is ignored. If all contracting processes in which the field exists agree, that value is used.
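The rule above can be sketched as follows. The input is assumed to be a mapping from ocid to the field's value, containing only the contracting processes in which the field exists; `agreed_value` is a hypothetical helper, not part of the kit.

```python
import warnings


def agreed_value(values_by_ocid, field_name):
    """Return the value if all processes agree, else warn and return None."""
    distinct = set(values_by_ocid.values())
    if len(distinct) == 1:
        return distinct.pop()
    if len(distinct) > 1:
        warnings.warn(
            f"{field_name} differs across contracting processes; ignoring it"
        )
    return None
```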
tender/title, tender/description, /planning/rationale:
If there are multiple contracting processes, the strings are concatenated, each preceded by its ocid in angle brackets, like:
<someocid> a tender description <anotherocid> another description
If there is only one contracting process, the ocid part is omitted.
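A minimal sketch of this concatenation, assuming the texts are supplied as a mapping from ocid to text in input order (`concat_text` is a hypothetical name):

```python
def concat_text(texts_by_ocid):
    """Join texts from several contracting processes, prefixing each ocid."""
    if len(texts_by_ocid) == 1:
        # A single contracting process: omit the ocid part.
        return next(iter(texts_by_ocid.values()))
    return " ".join(
        f"<{ocid}> {text}" for ocid, text in texts_by_ocid.items()
    )
```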
Parties ID across multiple contracting processes¶
Parties from different contracting processes can have conflicting parties/id values, and the same party can appear in multiple contracting processes, so the transforms need to identify which parties are in fact the same.
The transforms use the following logic to match parties:
- If all parties/id are unique across contracting processes then do nothing and add all parties to the project.
- If there are conflicting parties/id then look at the identifier field and if there are scheme and id make an id of somescheme-someid and use that in order to match parties across processes. If there are different roles then add them to the same party. Use the other fields from the first party found with this id.
- If there is no identifier then make up a new auto increment number and use that as the id. This means the original IDs get replaced and are lost in the mapping.
- If there is no identifier and all fields apart from roles and id are the same across parties then treat that as a single party and add the roles together and use a single generated id.
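The matching rules above can be sketched in a few lines. This is an illustrative reading of the rules, not the kit's code: `merge_parties` is a hypothetical name, and the generated ids being strings counting up from 1 is an assumption.

```python
import json
from itertools import count


def merge_parties(parties):
    """Merge parties gathered from several contracting processes."""
    ids = [party.get("id") for party in parties]
    if len(set(ids)) == len(ids):
        # All ids are unique: keep every party as it is.
        return [dict(party) for party in parties]

    merged = {}  # match key -> merged party, in first-seen order
    counter = count(1)
    for party in parties:
        identifier = party.get("identifier") or {}
        scheme, ident = identifier.get("scheme"), identifier.get("id")
        if scheme and ident:
            new_id = f"{scheme}-{ident}"
            key = ("identifier", new_id)
        else:
            # No usable identifier: match on all fields except id and roles.
            rest = {k: v for k, v in party.items() if k not in ("id", "roles")}
            new_id = None
            key = ("fields", json.dumps(rest, sort_keys=True))
        if key in merged:
            # Same party seen before: only add any new roles.
            existing = merged[key]
            for role in party.get("roles", []):
                if role not in existing["roles"]:
                    existing["roles"].append(role)
        else:
            copy = dict(party)
            copy["roles"] = list(party.get("roles", []))
            copy["id"] = new_id if new_id is not None else str(next(counter))
            merged[key] = copy
    return list(merged.values())
```

Fields other than roles come from the first party found for each match key, which mirrors the second rule in the list.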
Document ID across multiple contracting processes¶
If all project/documents/id values are unique, the ids are kept. Otherwise, new auto-incremented ids are created for all documents, which means the original documents/id values are lost.
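As a rough sketch of this rule, assuming the generated ids are strings counting up from 1 (the source does not specify their format) and using the hypothetical name `renumber_documents`:

```python
def renumber_documents(documents):
    """Keep ids if they are all unique, otherwise renumber every document."""
    ids = [document.get("id") for document in documents]
    if len(set(ids)) == len(ids):
        return documents
    # At least one clash: replace all ids, so the originals are lost.
    return [
        {**document, "id": str(number)}
        for number, document in enumerate(documents, start=1)
    ]
```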
Project Sector¶
Sectors are gathered from planning/project/sector. Each unique scheme and id pair is added to the sector array in the form <scheme>-<id>. This could mean that the generated sectors are not in the Project Sector Codelist.
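A minimal sketch of the sector gathering, assuming each entry found under planning/project/sector is an object with `scheme` and `id` keys, and that entries missing either key are skipped (`gather_sectors` is a hypothetical name):

```python
def gather_sectors(sector_objects):
    """Build a list of unique '<scheme>-<id>' codes, in first-seen order."""
    sectors = []
    for sector in sector_objects:
        scheme, sector_id = sector.get("scheme"), sector.get("id")
        if scheme and sector_id:
            code = f"{scheme}-{sector_id}"
            if code not in sectors:
                sectors.append(code)
    return sectors
```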
Release Links¶
contractingProcesses/releases within OC4IDS links to each release via a URL. This URL is generated if OCDS release packages are supplied and the package data contains a uri. If this is not the case, the transform adds an additional field, contractingProcesses/embeddedReleases, which contains all supplied releases in full.
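The choice between linking and embedding can be sketched as below. The function name `link_or_embed` is hypothetical, and the URL shape used here (the package uri followed by `#` and the release id) is an assumption made for illustration; the source does not say how the URL is built.

```python
def link_or_embed(release, package_uri=None):
    """Return the target field name and the entry to store in it."""
    if package_uri:
        # Assumed URL shape: package uri plus the release id as a fragment.
        entry = {"id": release["id"], "url": f"{package_uri}#{release['id']}"}
        return "releases", entry
    # No package uri available: keep the whole release.
    return "embeddedReleases", release
```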
Project Scope Summary¶
If --all-transforms is set or if project_scope_summary is included in --transforms, all tender/items and tender/milestones are copied to contractingProcesses/summary/tender. This gives the output enough information to infer the project scope.