Command-Line Interface¶
To see all commands available, run:
oc4idskit --help
Users on Windows should run set PYTHONIOENCODING=utf-8 and set PYTHONUTF8=1 in each terminal session before running any oc4idskit commands. To set these environment variables for all future sessions, run setx PYTHONIOENCODING utf-8 and setx PYTHONUTF8 1.
Inputs¶
To process a remote file:
curl <url> | oc4idskit <command>
To process a local file:
cat <path> | oc4idskit <command>
The inputs can be concatenated JSON or JSON arrays.
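To make the two accepted input shapes concrete, here is a minimal sketch of reading either concatenated JSON or JSON arrays from a string. It is an illustration only, not how oc4idskit itself parses input; the function names iter_json_values and iter_items are hypothetical.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found in concatenated JSON text."""
    decoder = json.JSONDecoder()
    index = 0
    length = len(text)
    while index < length:
        # raw_decode does not accept leading whitespace, so skip it first.
        while index < length and text[index].isspace():
            index += 1
        if index >= length:
            break
        value, index = decoder.raw_decode(text, index)
        yield value


def iter_items(text):
    """Yield items one by one, unpacking any top-level JSON array."""
    for value in iter_json_values(text):
        if isinstance(value, list):
            yield from value
        else:
            yield value


# Concatenated JSON and a JSON array yield the same items.
print(list(iter_items('{"id": "a"}\n{"id": "b"}')))
print(list(iter_items('[{"id": "a"}, {"id": "b"}]')))
```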
Options¶
Optional arguments for all commands are:
- --encoding ENCODING
the file encoding
- --ascii
print escape sequences instead of UTF-8 characters
- --pretty
pretty print output
- --root-path ROOT_PATH
the path to the items to process within each input
See the guidance for handling edge cases in OCDS. You can use the same approaches with OC4IDS data.
split-project-packages¶
Reads project packages from standard input, and prints smaller project packages for each.
Mandatory positional arguments:
size
the number of projects per package
cat tests/fixtures/oc4ids/project_package.json | oc4idskit split-project-packages 1 | split -l 1 -a 4
The split command will write files named xaaaa, xaaab, xaaac, etc. Don’t combine the OC4IDS Kit --pretty option with the split-project-packages command: split -l 1 expects each package on its own line, and pretty-printed output spans several lines.
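The splitting behaviour can be sketched in a few lines of Python. This is a simplified illustration, not the library's implementation: the function name split_package is hypothetical, and it assumes the package metadata is copied unchanged into every smaller package.

```python
import json


def split_package(package, size):
    """Yield copies of `package` holding at most `size` projects each."""
    projects = package.get("projects", [])
    for start in range(0, len(projects), size):
        smaller = dict(package)  # shallow copy keeps the package metadata
        smaller["projects"] = projects[start:start + size]
        yield smaller


package = {
    "uri": "http://example.com/package.json",  # placeholder value
    "projects": [{"id": "a"}, {"id": "b"}, {"id": "c"}],
}

# json.dumps without indentation writes each package on a single line,
# which is the shape that `split -l 1` relies on.
for smaller in split_package(package, 2):
    print(json.dumps(smaller))
```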
combine-project-packages¶
Reads project packages from standard input, collects projects, and prints one project package.
If the --publisher-* options aren’t used, the output package will have the same publisher as the last input package.
Optional arguments:
- --uri URL
set the project package’s uri to this value
- --published-date PUBLISHED_DATE
set the project package’s publishedDate to this value
- --version VERSION
set the project package’s version to this value
- --publisher-name PUBLISHER_NAME
set the project package’s publisher’s name to this value
- --publisher-uri PUBLISHER_URI
set the project package’s publisher’s uri to this value
- --publisher-scheme PUBLISHER_SCHEME
set the project package’s publisher’s scheme to this value
- --publisher-uid PUBLISHER_UID
set the project package’s publisher’s uid to this value
- --fake
set the project package’s required metadata to dummy values
cat tests/fixtures/project_package_split.json | oc4idskit combine-project-packages > out.json
If you need to create a single package that is too large to hold in your system’s memory, please comment on this issue.
For the Python API, see oc4idskit.combine.combine_project_packages().
Note
A warning is issued if a package’s "projects" field isn’t set.
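The combining rules described above (collect projects, keep the last publisher, warn on a missing "projects" field) can be sketched as follows. This is a hedged illustration, not the code behind oc4idskit.combine.combine_project_packages(), whose signature is not shown here; the name combine_packages and its publisher parameter are hypothetical.

```python
import warnings


def combine_packages(packages, publisher=None):
    """Collect the projects of several packages into one package (sketch)."""
    output = {"publisher": {}, "projects": []}
    for package in packages:
        if "publisher" in package:
            # Later packages overwrite earlier ones, so the last one wins.
            output["publisher"] = package["publisher"]
        if "projects" in package:
            output["projects"].extend(package["projects"])
        else:
            warnings.warn('package has no "projects" field')
    if publisher:
        # Stands in for the --publisher-* options: an explicit value
        # takes precedence over the input packages.
        output["publisher"] = publisher
    return output


combined = combine_packages([
    {"publisher": {"name": "First"}, "projects": [{"id": "a"}]},
    {"publisher": {"name": "Second"}, "projects": [{"id": "b"}]},
])
print(combined)
```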
convert-from-ocds¶
Reads individual releases or release packages from standard input, and prints a single project conforming to the Open Contracting for Infrastructure Data Standards (OC4IDS). It assumes all inputs belong to the same project.
You can refer to the documentation of the mapping between OCDS and OC4IDS.
Optional arguments:
- --project-id PROJECT_ID
set the project’s id to this value
- --all-transforms
run all optional transforms
- --transforms OPTIONS
comma-separated list of optional transforms to run
- --package
wrap the project in a project package
- --uri URI
if --package is set, set the project package’s uri to this value
- --published-date PUBLISHED_DATE
if --package is set, set the project package’s publishedDate to this value
- --version VERSION
if --package is set, set the project package’s version to this value
- --publisher-name PUBLISHER_NAME
if --package is set, set the project package’s publisher’s name to this value
- --publisher-uri PUBLISHER_URI
if --package is set, set the project package’s publisher’s uri to this value
- --publisher-scheme PUBLISHER_SCHEME
if --package is set, set the project package’s publisher’s scheme to this value
- --publisher-uid PUBLISHER_UID
if --package is set, set the project package’s publisher’s uid to this value
- --fake
if --package is set, set the project package’s required metadata to dummy values
cat releases.json | oc4idskit convert-from-ocds > out.json
Transforms¶
The transforms that are run are described here.
- additional_classifications, description, sector, title: populate top-level fields with their equivalents from planning.project
- administrative_entity, public_authority_role, procuring_entity, suppliers: populate the parties field according to the party role
- budget: populates budget.amount with its equivalent
- budget_approval, environmental_impact, land_and_settlement_impact and project_scope: populate the documents field from planning.documents according to the documentType
- contracting_process_setup: sets up the contractingProcesses array of objects with id, summary, releases and embeddedReleases. Some of the other transforms depend on this, so it is run first
- contract_period: populates the summary.contractPeriod field with appropriate values from awards or tender
- contract_price: populates the summary.contractValue field with the sum of all awards.value fields where the currency is the same
- cost_estimate: populates the summary.tender.costEstimate field with the appropriate tender.value
- contract_process_description: populates the summary.description field from appropriate values in contracts, awards or tender
- contract_status: populates the summary.status field using the contractingProcessStatus codelist
- contract_title: populates summary.title from the title field in awards, contracts or tender
- final_audit: populate the documents field from contracts.implementation.documents according to the documentType
- funding_sources: updates parties with organizations having funder in their roles or from planning.budgetBreakdown.sourceParty
- location: populates the locations field with an array of location objects from planning.project.locations
- procurement_process: populates the summary.tender.procurementMethod and summary.tender.procurementMethodDetails fields with their equivalents from tender
- purpose: populates the purpose field from planning.rationale
Optional transforms¶
Some transforms are not run automatically, but only if requested. The following transforms are run if they are listed in the --transforms argument (as part of a comma-separated list) or if --all-transforms is passed.
- buyer_role: updates the parties field with parties that have buyer in their roles
- description_tender: populate the description field from tender.description if no other is available
- location_from_items: populate the locations field from deliveryLocation or deliveryAddress in tender.items if no other is available
- project_scope_summary: updates summary.tender with items and milestones from tender
- purpose_needs_assessment: populate the documents field from planning.documents according to the documentType needsAssessment
- title_from_tender: populate the title field from tender.title if no other is available
Transformation Notes¶
Most transforms follow the logic in the mapping documentation. However, there is some room for interpretation in some of the mappings, so here are some notes about these interpretations.
Differing text across multiple contracting processes¶
planning/project/title, planning/project/description (planning and budget extension):
If there are any contradictions, i.e. one contracting process gives a different value from another, a warning is raised and the field is ignored. If all contracting processes agree (where the field exists in them), the value is used.
tender/title, tender/description, /planning/rationale:
If there are multiple contracting processes, the strings are concatenated, each preceded by its ocid in angle brackets, like:
<someocid> a tender description <anotherocid> another description
If there is only one contracting process, the ocid part is omitted.
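The two strategies above can be sketched as small Python functions. This is an illustration under assumptions, not the transforms' actual code: the function names are hypothetical, and the single-space separator is inferred from the example string above.

```python
import warnings


def agreed_text(values):
    """Return the shared value, or None (with a warning) on contradiction."""
    unique = set(values)
    if len(unique) > 1:
        warnings.warn("contracting processes disagree; field ignored")
        return None
    return unique.pop() if unique else None


def concatenate_text(values_by_ocid):
    """Join (ocid, text) pairs, prefixing each text with <ocid>."""
    if len(values_by_ocid) == 1:
        # With a single contracting process the ocid part is omitted.
        return values_by_ocid[0][1]
    return " ".join(f"<{ocid}> {text}" for ocid, text in values_by_ocid)


print(concatenate_text([
    ("someocid", "a tender description"),
    ("anotherocid", "another description"),
]))
```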
Parties ID across multiple contracting processes¶
When parties/id values from different contracting processes conflict, or when the same party appears in multiple contracting processes, the transforms need to identify which parties are in fact the same.
The logic the transforms use to match parties:
- If all parties/id are unique across contracting processes, do nothing and add all parties to the project.
- If there are conflicting parties/id, look at the identifier field and, if it has a scheme and id, make an id of the form somescheme-someid and use that to match parties across processes. If the matched parties have different roles, add them all to the same party. Use the other fields from the first party found with this id.
- If there is no identifier, generate a new auto-incremented number and use that as the id. This means the original IDs are replaced and are lost in the mapping.
- If there is no identifier and all fields apart from roles and id are the same across parties, treat them as a single party, combine the roles, and use a single generated id.
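The matching rules above can be sketched roughly as follows. This is one reading of the rules, not the transforms' actual code: the function name merge_parties is hypothetical, and the generated ids are assumed to be counter values rendered as strings.

```python
import json


def merge_parties(parties_by_process):
    """Merge party lists (one list per contracting process) into one list."""
    all_parties = [p for parties in parties_by_process for p in parties]
    ids = [party.get("id") for party in all_parties]
    if len(ids) == len(set(ids)):
        return all_parties  # all ids unique: keep every party unchanged

    merged = {}  # match key -> merged party
    counter = 0
    for party in all_parties:
        identifier = party.get("identifier") or {}
        if identifier.get("scheme") and identifier.get("id"):
            # Match on a "somescheme-someid" key built from the identifier.
            key = ("identifier", f"{identifier['scheme']}-{identifier['id']}")
        else:
            # No identifier: match on every field except id and roles.
            rest = {k: v for k, v in party.items() if k not in ("id", "roles")}
            key = ("fields", json.dumps(rest, sort_keys=True))

        if key in merged:
            # Same party seen before: add new roles, keep the first
            # party's other fields.
            roles = merged[key]["roles"]
            roles.extend(r for r in party.get("roles", []) if r not in roles)
            continue

        new_party = dict(party)
        new_party["roles"] = list(party.get("roles", []))
        if key[0] == "identifier":
            new_party["id"] = key[1]
        else:
            counter += 1
            new_party["id"] = str(counter)  # assumed format of generated ids
        merged[key] = new_party
    return list(merged.values())
```

Note that in this sketch the original parties/id values are discarded whenever any conflict exists, which matches the caveat above that original IDs are lost in the mapping.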
Document ID across multiple contracting processes¶
If all project/documents/id are unique, the ids are kept as they are. Otherwise, new auto-incremented ids are created for all documents. This means the original documents/id are lost.
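As a minimal sketch of this rule, assuming generated ids are counter values rendered as strings and starting at 1 (both assumptions), with the hypothetical function name renumber_documents:

```python
def renumber_documents(documents):
    """Keep document ids if they are all unique, otherwise renumber all."""
    ids = [document.get("id") for document in documents]
    if len(ids) == len(set(ids)):
        return documents
    # Any clash means every document gets a new id, not only the duplicates.
    return [
        {**document, "id": str(number)}
        for number, document in enumerate(documents, start=1)
    ]


print(renumber_documents([{"id": "doc"}, {"id": "doc"}, {"id": "other"}]))
```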
Project Sector¶
Sectors are gathered from planning/project/sector: each unique scheme and id pair is added to the sector array in the form <scheme>-<id>. This could mean that the sectors generated are not in the Project Sector Codelist.
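A rough sketch of the sector gathering follows. It assumes planning/project/sector is a single object with scheme and id keys, and that entries missing either key are skipped; both are assumptions, and the function name collect_sectors is hypothetical.

```python
def collect_sectors(releases):
    """Build a list of unique "<scheme>-<id>" strings from releases."""
    sectors = []
    for release in releases:
        sector = release.get("planning", {}).get("project", {}).get("sector")
        if not sector:
            continue
        scheme, sector_id = sector.get("scheme"), sector.get("id")
        if not scheme or not sector_id:
            continue  # assumption: incomplete sectors are skipped
        code = f"{scheme}-{sector_id}"
        if code not in sectors:  # keep first-seen order, drop duplicates
            sectors.append(code)
    return sectors
```

Because the codes are built from whatever scheme and id the source data uses, nothing in this process checks them against the Project Sector Codelist.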
Release Links¶
contractingProcesses/releases within OC4IDS links to releases via a URL. This URL is generated if OCDS release packages are supplied and a uri is in the package data. If this is not the case, the transform adds an additional field, contractingProcesses/embeddedReleases, which contains all the supplied releases in full.
Project Scope Summary¶
If --all-transforms is set or if project_scope_summary is included in --transforms, the transform copies all tender/items and tender/milestones to contractingProcesses/summary/tender. This is to give the output enough information to infer project scope.