Tutorial
Requirements
- Python >=3
- Psycopg2
- PostgreSQl >= 10
Installation
Windows
Use pip to install the packages in the virtual environment:
pip install psycopg2
pip install pgOperations
Linux distributions
In Linux it is good practice always use Python virtual environments to install new Python libraries. To use Python virtual environments you must install the venv
library:
NOTE: Make sure, in the bellow listings, to replace the
X
ofpython3.X
for your Python distribution version.
sudo apt-get install python3.X-venv
Navigate to the folder where you want to create the virtual environment, for example the venvs
folder, and type:
python3.X -m venv pgOperations
Activate the virtual environment to install the packages:
source pgOperations/bin/activate
Install the psycopg2 dependency:
(pgOperations)$pip install psycopg2
In Linux distributions may be the installation of psycopg2
fail. In this cases you must install the following before:
sudo apt-get install libpq-dev -y
sudo apt-get install build-essential -y
sudo apt-get install python3.X-dev -y
Now you can proceed with the installation of psycopg2
.
(pgOperations)$pip install psycopg2
Collecting psycopg2
Building wheels for collected packages: psycopg2
Building wheel for psycopg2 (setup.py) ... error
...
...
Installing collected packages: psycopg2
Running setup.py install for psycopg2 ... done
Successfully installed psycopg2-2.9.5
And install pgOperations
:
(pgOperations)$pip install pgOperations
You are ready to use the library to edit data in PostgreSQL and PostGIS.
Summary
There are mainly three classes whose methods are useful: PgOperations, to edit data, PgCounters, to manage counters and PgDatabases, to manage databases. The rest of the classes are auxiliary classes to arrange the parameters that the useful classes need.
Create a database
To create a database it is necessary first to have an opened connection. Create a regular psycopg2 connection to the 'postgres' database:
conn=psycopg2.connect(database="postgres", user="postgres",
password="postgres", host="localhost", port=5432)
Create a PgConnection instance:
pgc = pg.PgConnection(conn)
Create a PgDatabases instance. This class allow to create and delete databases:
pgdb=pg.PgDatabases(pgConnect=pgc)
Create the database:
pgc2=pgdb.createDatabase(databaseName="pgoperationstest",
addPostgisExtension=True,closeNewConnection=False)
The createDatabase method returns an opened PgConnect instance, connected to the new database, and ready to use with the main class of thlis library PgOperations.
Create an example table
Execute regular SQL code to create an scheme and a table in the pgoperationstest
database, continuing with the previous listing. There is also an utility for creating tables. See
PgOperations.pgCreateTable.
pgc2.cursor.execute("create schema d")
pgc2.cursor.execute("create table d.points (gid serial primary key,
description varchar, depth double precision,
geom geometry('POINT',25831))")
Commit the changes and close the connections to the databases postgres
and pgoperationstest
:
pgc2.commit()
pgc2.disconnect()
pgc.disconnect()
Edit data
The examples in this section use the table d.points
created in the create a table section. This table has the fields gid
, description
, deph
and geom
fields. The geom
field is a PostGIS geomerty field type, of type POINT
in the SRC 25831
.
NOTE: It is not necessary to have a geometry field in the tables to use the pgOperations module.
Also, in the next examples it is supposed the pgOperations
module has been imported,
stored in the pg
variable, and there is
a PgOperations instance stored in the variable called ´pgo´:
from pgOperations import pgOperations as pg
oCon=pg.PgConnect(database="pgoperationstest", user="postgres",
password="postgres", host="localhost", port="5432")
pgo=pg.PgOperations(pgConnection=oCon,global_print_queries=True))
The parameter global_print_queries=True
indicates that all the methods of the pgo
object,
created in the previous listing, must print a detailed information: query, parameters,
values, and result. This mode is to know what is happening, for debugging purposes.
NOTE: This library supposes all python dictionary keys match with the table field names.
Insert
To insert you can use the method PgOperations.pgInsert. In the next sections you will see different use cases.
Example without PostGIS geometry field
Example with automatic generation of the SQL expression, using a dictionary, where the key names must be match with the table field names. The order is indifferent. In this case there is not geometry field:
d={"description": "water well", "depth": 12.15}
fieldsAndValues=pg.FieldsAndValues(d=d)
resp=pgo.pgInsert(table_name="d.points", fieldsAndValues=fieldsAndValues,
str_fields_returning="gid")
print(resp)
The outputs are the following:
pgInsert
Query: insert into d.points (description,depth) values (%s,%s) returning gid
Values: ['water well', 12.15]
[{'gid': 1}]
Below the prints are explained:
-
The first three prints are printed because the sentence
global_print_queries
was set toTrue
on creating the objectpgo
. It prints the SQL sentence to execute with aPsycopg2
cursor:'insert into d.points (description,depth) values (%s,%s) returning gid'
. The method also prints the values to be used:['water well', 12.15]
. -
The last print
[{'gid': 1}]
shows the returned value. The returned value can be an empty list, or several fields values in a dictionary. The values to be returned are specified in the string parameterstr_fields_returning="gid"
.
Example with a PostGIS geometry field
In this example the coordinates of the geometry are in the SRC EPSG 25830, but the table requires the SRC EPSG 25831. The example reprojects the coordinates to be able to be inserted in the table. This is not necessary if the original coordinates of the geometry match with the SRC of the geometry field of the table.
d={"description": "water well", "depth": 12.15, "geom": "POINT(100 200)"}
geometryFieldOptions=pg.GeometryFieldOptions(geom_field_name="geom",
epsg='25830',epsg_to_reproject="25831")
fieldsAndValues=pg.FieldsAndValues(d=d, list_fields_to_remove=["depth"],
geometryFieldOptions=geometryFieldOptions)
resp=pgo.pgInsert(table_name="d.points", fieldsAndValues=fieldsAndValues,
str_fields_returning="gid")
print(resp)
The outputs are the following:
pgInsert
Query: insert into d.points (description,geom) values (%s,st_transform(st_geometryfromtext(%s,25830),25831)) returning gid
Values: ['water well', 'POINT(100 200)']
[{'gid': 2}]
Example generating the expressions manually
This example manually generates the expressions using the class FieldsAndValuesBase, so you are free to write the SQL sentence, for more complicated cases.
fieldsAndValuesBase=pg.FieldsAndValuesBase(
str_field_names="depth, description, geom",
list_field_values=[12.15, "water well","POINT(100 200)"],
str_s_values="%s,%s,st_transform(st_geometryfromtext(%s,25830),25831)")
resp=pgo.pgInsert(table_name="d.points", fieldsAndValues=fieldsAndValuesBase,
str_fields_returning="gid")
print(resp)
Result:
pgInsert
Query: insert into d.points (depth, description, geom) values (%s,%s,st_transform(st_geometryfromtext(%s,25830),25831)) returning gid
Values: [12.15, 'water well', 'POINT(100 200)']
[{'gid': 3}]
Update
To update a table you can use the method PgOperations.pgUpdate. In the next sections you will see different use cases.
Example using a Python dictionary
The following example updates al the rows where gid=1
, and only updates the field geom
because the
field description
is elimitated:
d={"description": "water well2", "geom": "POINT(300 300)"}
geometryFieldOptions=pg.GeometryFieldOptions(
geom_field_name="geom",epsg=25831
)
fieldsAndValues=pg.FieldsAndValues( d=d,
list_fields_to_remove=['description'],
geometryFieldOptions=geometryFieldOptions
)
whereClause=pg.WhereClause(where_clause="gid=%s", where_values_list=[1])
resp=pgo.pgUpdate(table_name="d.points",
fieldsAndValues=fieldsAndValues,
whereClause=whereClause)
print (resp)
The result of the previous listing is:
pgUpdate
Query: update d.points set (geom) = row(st_geometryfromtext(%s,25831)) where gid=%s
where_clause: gid=%s
where_values_list [1]
New field values: ['POINT(300 300)']
Number of rows updated: 1
1
The returned value was 1
, therefore only one row has been updated.
Example generating the SQL expression manually
Next example updates the fields description
and geom
, generating the expression
manually:
fieldsAndValuesBase=pg.FieldsAndValuesBase(
str_field_names="description, geom",
list_field_values=["water well updated","POINT(300 300)"],
str_s_values="%s,st_transform(st_geometryfromtext(%s,25830),25831)")
whereClause=pg.WhereClause(where_clause="gid=%s", where_values_list=[1])
resp=pgo.pgUpdate(table_name="d.points",
fieldsAndValues=fieldsAndValuesBase,
whereClause=whereClause
)
print (resp)
The result of the previous listing is:
pgUpdate
Query: update d.points set (description, geom) = row(%s,st_transform(st_geometryfromtext(%s,25830),25831)) where gid=%s
where_clause: gid=%s
where_values_list [1]
New field values: ['water well updated', 'POINT(300 300)']
Number of rows updated: 1
1
One row has been updated.
Select
To select you can use the method PgOperations.pgSelect. In the next listing you will find an example.
The pgSelect method returns the selected rows as list of dictionaries, or as list of tuples. Each element of the list represents a selected row.
Example:
whereClause=pg.WhereClause(where_clause='gid > %s and gid < %s',
where_values_list=[0, 3])
resp=pgo.pgSelect(table_name="d.points",
string_fields_to_select='gid,depth,description,st_astext(geom)',
whereClause=whereClause, print_query=False, orderBy='gid desc')
print(resp)
The result of the previous command is:
pgSelect
Query: SELECT array_to_json(array_agg(registros)) FROM
(select gid,depth,description,st_astext(geom) from d.points where gid > %s and gid < %s order by gid desc limit 100) as registros
where_clause: gid > %s and gid < %s
where_values_list [0, 3]
Num of selected rows: 2
[
{'gid': 2, 'depth': None, 'description': 'water well', 'st_astext': 'POINT(-673652.1897909392 202.79364615897794)'},
{'gid': 1, 'depth': 12.15, 'description': 'water well updated', 'st_astext': 'POINT(-673449.3960596526 304.1894476513122)'}
]
In the above example, if you set the parameter get_rows_as_dicts
to false,
the result is a list of tuples:
[
(2, None, 'water well', 'POINT(-673652.1897909392 202.79364615897794)'),
(1, None, 'water well updated', 'POINT(-673449.3960596526 304.1894476513122)')
]
Delete
To delete you can use the method PgOperations.pgDelete. In the next listing you will find an example.
whereClause=pg.WhereClause(where_clause='gid < %s', where_values_list=[3])
resp=pgo.pgDelete(table_name="d.points",whereClause=whereClause,print_query=False)
print(resp)
The result of the above example is:
pgDelete
Query: delete from d.points where gid < %s
where_clause: gid < %s
where_values_list [3]
Number of rows deleted: 2
2
Other utilities
Create table
To create the new table you can use the PgOperations.createTable utility.
r=pgo.pgCreateTable(table_name_with_schema="d.customers",
fields_definition="gid serial, name varchar, img varchar",
delete_table_if_exists= True,print_query=False)
Delete rows and files
This utility deletes the selected rows, and their associated files in the hard disk.
The rows have to have a field with the file names to delete. The file names
can contain an absolute path, or a relative path. It is possible to complete
relative paths with the parameter base_path
.
see the method PgOperations.pgDeleteWithFiles for mor details about the parameters.
This method only has been tested in Linux systems.
As an example a new table is needed.
r=pgo.pgCreateTable(table_name_with_schema="d.customers",
fields_definition="gid serial, name varchar, img varchar",
delete_table_if_exists= True,print_query=False)
Now let insert some data:
d1={"name": "customer 1", "img": "image1.jpg"}
d2={"name": "customer 2", "img": "image2.jpg"}
d3={"name": "customer 3", "img": "image3.jpg"}
l=[d1,d2,d3]
for d in l:
fieldsAndValues=pg.FieldsAndValues(d=d)
r=pgo.pgInsert(table_name="d.customers",
fieldsAndValues=fieldsAndValues,
str_fields_returning="gid")
print("Customer inserted. gid: ", r[0]["gid"])
Suppose the images are im the folder /home/user/app/media/customers/img
.
To delete all rows and all images:
r=pgo.pgDeleteWithFiles(table_name="d.customers",field_name_with_file_name="img",
base_path="/home/user/app/media/customers/img")
In the previous example as the whereClause
is None, all the rows are selected to be deleted.
If in the folder /home/user/app/media/customers/img
there are
only the files image1.jpg
and image2.jpg
, the result will be:
{'numOfRowsDeleted': 3, 'deletedFileNames': ["image1.jpg", "image2.jpg"],
'notDeletedFilenames': ['image3.jpg'],
'base_path': '/home/user/app/media/customers/img'}
Get table field names
This utility returns the table field names, as a string, or as a list. Besides
if the table has a geometry field, e.g. geom
, this utility can return this
field name as geom
, st_astext(geom)
, st_asgeojson(geom)
, st_transform(st_asgeojson(geom),givenEPSG)
, or st_transform(st_astext(geom),givenEPSG)
. The objetive is, the output of
this function, could be used as input for the parameter list_fields_to_select
of the methods PgOperations.pgSelect and PgOperations.pgUpdate
It is very common not to update all the field values, as some of them are automatically set
by the database, because they have default values, commonly serials
, or timestamp
field types.
Because that, this utility allow to remove some fields of the output. Next example returns
all field names in a list:
r=pgo.pgGetTableFieldNames('d.points')
Results:
pgGetTableFieldNames
Query: SELECT column_name FROM information_schema.columns WHERE table_schema=%s and table_name = %s
Fields list: ['gid', 'description', 'depth', 'geom']
Field names: ['gid', 'description', 'depth', 'geom']
Example to get all the field names, except description
, getting the geometry as geojson and
getting the result as string:
gf=pg.SelectGeometryFormat()
gfo=pg.SelectGeometryFieldOptions(geom_field_name='geom',
select_geometry_format=gf.geojson,
epsg_to_reproject='25831')
r=pgo.pgGetTableFieldNames('d.points',gfo,
list_fields_to_remove=['description'],returnAsString=True)
print('Field names: ', r)
Result:
pgGetTableFieldNames
Query: SELECT column_name FROM information_schema.columns WHERE table_schema=%s and table_name = %s
Fields list: ['gid', 'depth', 'st_asgeojson(st_transform(geom,25831))']
Field names: gid,depth,st_asgeojson(st_transform(geom,25831))
Next example shows how to use the output of this method as input of PgOperations.pgSelect:
gf=pg.SelectGeometryFormat()
gfo=pg.SelectGeometryFieldOptions(geom_field_name='geom',select_geometry_format=gf.geojson, epsg_to_reproject='25831')
fieldNames=pgo.pgGetTableFieldNames('d.points',gfo,list_fields_to_remove=['description'],returnAsString=True)
print('Field names: ', fieldNames)
wc=pg.WhereClause(where_clause='gid=%s',where_values_list=[3])
res=pgo.pgSelect(table_name='d.points', string_fields_to_select=fieldNames,whereClause=wc)
print('Selection result: ', res)
Results:
pgGetTableFieldNames
Query: SELECT column_name FROM information_schema.columns WHERE table_schema=%s and table_name = %s
Fields list: ['gid', 'depth', 'st_asgeojson(st_transform(geom,25831))']
Field names: gid,depth,st_asgeojson(st_transform(geom,25831))
pgSelect
Query: SELECT array_to_json(array_agg(registros)) FROM (select gid,depth,st_asgeojson(st_transform(geom,25831)) from d.points where gid=%s limit 100) as registros
where_clause: gid=%s
where_values_list [3]
Num of selected rows: 1
Selection result: [{'gid': 3, 'depth': 12.15, 'st_asgeojson': '{"type":"Point","crs":{"type":"name","properties":{"name":"EPSG:25831"}},"coordinates":[-673652.189790939,202.793646159]}'}]
Table exists
Returns True or False, depending whether or not the table exists. See the PgOperations.pgTableExists method documentation for more details:
Example:
res=pgo.pgTableExists(table_name_with_schema='d.points')
print('Table exists: ', res)
Result:
Table exists
pgTableExists
Query: SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema = %s AND table_name = %s)
Table exists: True
Value exists in field
It is very common to check if a value exists in a column. For example in the case of
users, or emails. This function returns True
or False
,
depending whether or not the value exists a column.
See the PgOperations.pgValueExists method documentation for more details:
Example:
res=pgo.pgValueExists(table_name_with_schema='d.points',field_name='gid',field_value= 3)
Result:
pgValueExists
Query: SELECT exists (SELECT gid FROM d.points WHERE gid = %s LIMIT 1)
Field name: 'gid'. Field value: 3
Exists: True
Value exists: True
Manage counters
This module has a class, called PgCounters to manage counters. See the class method documentations to get more details.
Add a counter
Use the method PgCounters.addCounter to create a counter:
counter_name = 'c1'
c=pg.PgCounters(pgo)
c.addCounter(counter_name,counter_name + ' description')
Results:
Add counter
pgTableExists #pgTable exists call to check if the table counters.counters exists
Query: SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema = %s AND table_name = %s)
addCounter #add counter prints
Create sequence: create sequence counters.c1 as integer start with %s increment by %s
pgInsert #pgInsert call to insert in the table counters.counters
Query: insert into counters.counters (counter_name,counter_description) values (%s,%s)
Values: ['c1', 'c1 description']
As you can see the table counters.counters
is created to have a registry of the created counters.
The table contains the counter name, and one description. All the counters are sequences
created
in the schema counters
, therefore their names must be unique.
Increment a counter
Use the method PgCounters.incrementCounter to increment a counter:
v1=c.incrementCounter(counter_name)
print('Returned value 1: ',v1)
Results:
incrementCounter
Query: select nextval(%s)
Current counter value: 1
Returned value 1: 1
Get the current counter value
Use the method PgCounters.getCounterValue to get the counter value:
r1=c.getCounterValue(counter_name)
print('Returned value 3: ',r1)
Results:
getCounterValue
Quer: y select last_value from counters.c1
Current counter value: 1
Returned value 3: 1
Get all the counter name, description and values
Use the method PgCounters.getAllCounters to get all the counters, its description, and its values:
r=c.getAllCounters()
print('All counters: ',r)
The results, after having added another, counter and having incremented them, are the following:
...
#prints of other PgOperations method calls inside the method getAllCounters
...
All counters: [
{'gid': 17, 'counter_name': 'c1', 'counter_description': 'c1 description', 'value': 2},
{'gid': 18, 'counter_name': 'c2', 'counter_description': 'c2 description', 'value': 2}
]
Delete a counter
Use the method PgCounters.deleteCounter to delete a counter:
n=c.deleteCounter('c1')
And the results are the following:
pgDelete #the method calls pgDelete to remove the
corresponding row in counters.counters
Query: delete from counters.counters where counter_name=%s
where_clause: counter_name=%s
where_values_list ['c1']
Number of rows deleted: 1
deleteCounter #the deleteCounter prints start here
Query drop sequence if exists counters.c1
Sequences deleted: 1
As you can see in the previous listing, the associated counter sequence has being deleted,
as well as its corresponding row in the table counters.counters
.