Mapping Oracle data types to PostgreSQL
You should know SQL relatively well. Knowing the details of Oracle SQL and Postgres SQL is obviously a tremendous advantage, but the hints in this document should quickly bring you up to speed on the differences.
If you're porting Oracle SQL to Postgres SQL for the ACS/pg, you should also be quite familiar with AOLserver Tcl, especially the AOLserver database APIs.
In this document, we're talking about:
- Oracle 10g to 11g (most of this will work down to 8i)
- Oracle 12c differs in some respects (multitenant DB), but migration may be even easier
- Postgres 8.4; sometimes this also works for earlier versions
- Enterprise vs. Oracle RAC migrations
In some cases Oracle Enterprise has a more direct migration path to PostgreSQL than Oracle Real Application Clusters (RAC). With RAC you may have multiple separate, heavy-hitting DML applications, usually of the OLTP type, connected to the same RAC cluster, where RAC serves as a kind of application farm. A common mistake when migrating from Oracle RAC farms is to associate all of the farm's applications with one PostgreSQL instance. The big picture missed here is ACTIVE-ACTIVE (Oracle RAC) versus ACTIVE-PASSIVE (PostgreSQL). While Oracle RAC can divvy up the applications and load-balance them across the nodes in the cluster, there is no such thing in PostgreSQL. So the "right" solution, without some re-architecture and/or use of third-party tools and extensions, is to migrate the applications off Oracle RAC one at a time to separate PostgreSQL instances: one heavy-hitting application per PostgreSQL instance. The next two sections illustrate CPU and memory factors that come into play when PostgreSQL hosts two or more heavy-hitting applications.
Data Types
Outer Joins
Outer Joins in old (pre version 9i) Oracle work as follows:
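A representative query in the old Oracle syntax (table and column names are illustrative):

```sql
select a.field1, b.field2
from a, b
where a.item_id = b.item_id (+)
```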
where the (+) indicates that, if there is no row in table b that matches the correct item_id, the match should still happen, with an empty row from table b. In this case, for example, for all rows in table a where there is no matching row in b, a row will still be returned where a.field1 is correct, but b.field2 is null.
In PostgreSQL and in Oracle 9i and later:
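The same query in the standard outer join syntax (names are illustrative):

```sql
select a.field1, b.field2
from a left outer join b on (a.item_id = b.item_id)
```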
In certain other cases, where only aggregate values are pulled out of the outer-joined table, it's possible to avoid using a join at all: the original query can be rewritten as a Postgres query that calls a user-defined function in place of the join.
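A hypothetical illustration of this pattern (b_count is an invented helper function):

```sql
-- Original (Oracle), counting rows from the outer-joined table:
select a.field1, count(b.item_id)
from a, b
where a.item_id = b.item_id (+)
group by a.field1;

-- Postgres rewrite without a join:
create function b_count(integer) returns bigint as
  'select count(*) from b where item_id = $1'
language sql stable;

select a.field1, b_count(a.item_id) from a;
```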
Starting with 9i, Oracle also supports the SQL-99 outer join syntax. Unfortunately, some programmers still stick to the old syntax, even nearly 20 years after it was superseded; that is why this chapter is still necessary.
CPU Contention
A PostgreSQL instance that has multiple heavy-hitting, DML applications connected to it can have CPU load problems due to the fact that we are stacking up or accumulating the concurrent, active transactions of multiple applications. A general rule in PostgreSQL is that as the number of active, concurrent transactions (pg_stat_activity.state = 'active') exceeds 2 times the number of CPUs, we begin to experience CPU load saturation. This would be expected with multiple, heavy-hitting applications working against the same PostgreSQL Instance.
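A quick way to check the number of active transactions against the 2x-CPUs threshold mentioned above is a query like:

```sql
-- Count concurrently active transactions on this instance
SELECT count(*) AS active_transactions
FROM pg_stat_activity
WHERE state = 'active';
```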
Summary
Converting the NUMBER data type from Oracle to PostgreSQL is always tricky. It's not recommended to convert all NUMBER data type columns to NUMERIC or DOUBLE PRECISION in PostgreSQL without a proper analysis of the source data. Having the appropriate data types helps improve performance, so spending time upfront to determine the right data types pays long-term dividends. From the Oracle system tables, you get the precision and scale of all the columns of all the tables for a user. With this metadata, you can choose the target data type in PostgreSQL as INT, BIGINT, DOUBLE PRECISION, or NUMERIC.
However, you can't always depend on this information. Although the metadata may show the scale of some columns as > 0, the actual data might not contain any fractional values. Look at the actual data in every table's NUMBER columns (those with scale > 0), and decide on the target columns' data types.
Precision(m) | Scale(n) | Oracle | PostgreSQL |
m < 10 | 0 | NUMBER(m,n) | INT |
10 ≤ m ≤ 18 | 0 | NUMBER(m,n) | BIGINT |
m + n ≤ 15 | n > 0 | NUMBER(m,n) | DOUBLE PRECISION |
m + n > 15 | n > 0 | NUMBER(m,n) | NUMERIC |
In the next post in this series, we cover methods to convert the data type in PostgreSQL after analysis is complete, and we discuss data types for the source Oracle NUMBER data type.
ROWNUM and ROWID
Oracle's pseudocolumn ROWNUM returns a number assigned at the moment the row is read, but before the ORDER BY is executed. You might consider replacing it with ROW_NUMBER() OVER (ORDER BY ...), which works in many cases. Emulations with sequences are potentially slower.
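A sketch of the difference (assuming a table t with an id column):

```sql
-- Oracle: number the rows as they are read, then filter
select * from (select t.*, rownum as rn from t) where rn <= 10;

-- PostgreSQL: number rows explicitly over a chosen ordering
select t.*, row_number() over (order by id) as rn from t;
```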
Oracle's pseudocolumn ROWID returns the physical address of a table row, encoded in base64. In applications it can be used to (temporarily) cache the address of rows in order to locate them more easily a second time. Postgres has ctid, which serves the same purpose.
Sequence
Oracle and Postgres use different syntax for fetching nextval from a sequence.
Oracle
To bring them to a common form, we create a function in both Postgres and Oracle and rewrite all code to use that function.
Oracle
Using the Postgres function nextval('my_seq') directly won't work: although Oracle will let you create such a function, it won't let you select values from it.
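One possible shape of such a wrapper, here called seq_nextval (the name and sequence are ours):

```sql
-- PostgreSQL: a thin wrapper over the native nextval
CREATE OR REPLACE FUNCTION seq_nextval(p_seq text) RETURNS bigint
  AS 'SELECT nextval($1)' LANGUAGE sql;

-- Oracle: same name and call shape, using dynamic SQL
CREATE OR REPLACE FUNCTION seq_nextval(p_seq IN VARCHAR2) RETURN NUMBER IS
  v NUMBER;
BEGIN
  EXECUTE IMMEDIATE 'SELECT ' || p_seq || '.nextval FROM dual' INTO v;
  RETURN v;
END;
```

Application code can then call seq_nextval('my_seq') identically on both databases.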
Text fields
Oracle has no unlimited text type; LOBs are used instead. PostgreSQL has a dedicated type, text. For the code to work with both databases, the annotations on text fields must be written accordingly.
For Postgres we also use a custom dialect that knows about the added functions.
About the authors
Baji Shaik is a Consultant with AWS ProServe, GCC India. His background spans a wide depth and breadth of expertise and experience in SQL/NoSQL database technologies. He is a database migration expert and has developed many successful database solutions addressing challenging business requirements for moving databases from on premises to Amazon RDS and Aurora PostgreSQL/MySQL. He is an eminent author, having written several books on PostgreSQL; a few of his recent works include "PostgreSQL Configuration", "Beginning PostgreSQL on the Cloud", and "PostgreSQL Development Essentials". Furthermore, he has delivered several conference and workshop sessions.
Sudip Acharya is a Sr. Consultant with the AWS ProServe team in India. He works with internal and external Amazon customers to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.
Deepak Mahto is a Consultant with the AWS ProServe team in India. He has been working as a database migration lead, helping and enabling customers to migrate from commercial engines to Amazon RDS. His passion is automation, and he has designed and implemented multiple database and migration-related tools.
Adding functions missing from PostgreSQL
To avoid rewriting code, we simply create the functions that are missing from PostgreSQL but exist in Oracle and are used in the project.
nvl(timestamp with time zone, timestamp with time zone)
nvl2(date, date, date)
nvl2(integer, integer, integer)
nvl2(numeric, numeric, numeric)
nvl2(text, text, text)
nvl2(timestamp, timestamp, timestamp)
nvl2(varchar, varchar, varchar)
trunc(timestamp with time zone, varchar)
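A sketch of a few of these shims; each is a thin wrapper over COALESCE or CASE:

```sql
CREATE OR REPLACE FUNCTION nvl(timestamptz, timestamptz) RETURNS timestamptz
  AS 'SELECT COALESCE($1, $2)' LANGUAGE sql IMMUTABLE;

-- nvl2 returns the second argument when the first is NOT NULL,
-- otherwise the third, mirroring Oracle's semantics
CREATE OR REPLACE FUNCTION nvl2(text, text, text) RETURNS text
  AS 'SELECT CASE WHEN $1 IS NOT NULL THEN $2 ELSE $3 END' LANGUAGE sql IMMUTABLE;

CREATE OR REPLACE FUNCTION nvl2(numeric, numeric, numeric) RETURNS numeric
  AS 'SELECT CASE WHEN $1 IS NOT NULL THEN $2 ELSE $3 END' LANGUAGE sql IMMUTABLE;
```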
Conversion to PostgreSQL DOUBLE PRECISION or NUMERIC
You can convert Oracle tables with columns of the data type NUMBER or INTEGER to PostgreSQL DOUBLE PRECISION or NUMERIC if they meet the following criteria:
- The data type is NUMERIC or NUMBER in Oracle
- DATA_PRECISION is NOT NULL
- DATA_SCALE is > 0 (float values)
You can use the following query on Oracle to find candidate columns of NUMERIC or INTEGER data type that can be converted to DOUBLE PRECISION or NUMERIC in PostgreSQL:
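A query with this intent against Oracle's data dictionary (the HR owner is an example) might look like:

```sql
SELECT owner, table_name, column_name, data_type, data_precision, data_scale
FROM all_tab_columns
WHERE owner = 'HR'
  AND data_type = 'NUMBER'
  AND data_precision IS NOT NULL
  AND data_scale > 0
ORDER BY table_name, column_name;
```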
The following is an example output of the query:
The output contains the column COMMISSION_PCT from the EMPLOYEES table, which can be converted to DOUBLE PRECISION. The output also contains two columns (LONGI and LATTI) that can be NUMERIC because they have (precision + scale) > 15.
Changing the data type for all candidate columns may not have the same impact or performance gain. Most columns that are part of the key or index appear in joining conditions or in lookup, so changing the data type for those columns may have a big impact. The following are guidelines for converting the column data type in PostgreSQL, in order of priority:
- Consider changing the data type for columns that are part of a key (primary, unique, or reference)
- Consider changing the data type for columns that are part of an index
- Consider changing the data type for all candidate columns
Analysis method: Metadata-based data type conversion
By looking at the metadata information of Oracle tables with columns of the NUMBER or INTEGER data type, we can come up with the target data type recommendations.
Numeric Types
Oracle programs mostly use NUMBER. The appropriate data type in Postgres is decimal or numeric; the two are equivalent. The limits in Postgres (up to 131072 digits before the decimal point; up to 16383 digits after it) are much higher than in Oracle, and the values are stored internally in a similar way.
Oracle allows you to use INTEGER, REAL, DOUBLE PRECISION and FLOAT, but it maps them to NUMBER.
Later versions of Oracle have BINARY_FLOAT and BINARY_DOUBLE, which correspond to PostgreSQL's real and double precision.
CLOBs
Postgres has decent CLOB support in the form of TEXT.
Script to find sessions that are blocking other sessions in PostgreSQL
Datatypes in Oracle and PostgreSQL
While there are many similarities between Oracle and PostgreSQL, there are a couple of differences that are quite noteworthy. The first thing many people might notice is that PostgreSQL has many more data types than Oracle. As of version 10.0, an empty PostgreSQL database exposes the staggering number of 92 data types. Of course, not all of them are useful; many are used purely for internal purposes.
Still, at the end of the day there are simply more data types available to applications.
Let us take a look and see, which types can be matched. The following table contains a lot of potential options:
Oracle type | Possible PostgreSQL types |
CHAR | char, varchar, text |
NCHAR | char, varchar, text |
VARCHAR | char, varchar, text |
VARCHAR2 | char, varchar, text, json |
NVARCHAR2 | char, varchar, text |
CLOB | char, varchar, text, json |
LONG | char, varchar, text |
RAW | uuid, bytea |
BLOB | bytea |
BFILE | bytea (read-only) |
LONG RAW | bytea |
NUMBER | numeric, float4, float8, char, varchar, text |
NUMBER(n,m) with m <= 0 | numeric, float4, float8, int2, int4, int8, boolean, char, varchar, text |
FLOAT | numeric, float4, float8, char, varchar, text |
BINARY_FLOAT | numeric, float4, float8, char, varchar, text |
BINARY_DOUBLE | numeric, float4, float8, char, varchar, text |
DATE | date, timestamp, timestamptz, char, varchar, text |
TIMESTAMP | date, timestamp, timestamptz, char, varchar, text |
TIMESTAMP WITH TIME ZONE | date, timestamp, timestamptz, char, varchar, text |
TIMESTAMP WITH LOCAL TIME ZONE | date, timestamp, timestamptz, char, varchar, text |
INTERVAL YEAR TO MONTH | interval, char, varchar, text |
INTERVAL DAY TO SECOND | interval, char, varchar, text |
MDSYS.SDO_GEOMETRY | geometry (see “PostGIS support” below) |
A second important thing is varchar2: on the PostgreSQL side it can easily be mapped to varchar or text. A small detail is that in Oracle a varchar2 length can be given in bytes or in characters. Consider the following example: VARCHAR2(20 BYTE) vs. VARCHAR2(10 CHAR). You have to think twice about what to do on the PostgreSQL side, because in Postgres we are always talking about characters.
Transformers.ALIAS_TO_ENTITY_MAP
If Transformers.ALIAS_TO_ENTITY_MAP is used, the returned types must be specified explicitly. By default Oracle folds the keys to upper case and PostgreSQL to lower case, and the resulting key mismatch can only be fixed by hand.
Preparation
ora2pg was used as the utility for converting the data. The process of using it is described very well in the article.
Create a project in ora2pg, configure it, and generate the schema.
Create the 'ora_schema' schema in PostgreSQL and the tables from the file './schema/tables/table.sql'.
Switch Hibernate into create mode and generate one more reference schema, 'hb_schema'. If the project uses views, the number of tables in the two schemas will not match: Hibernate generates full tables instead of views, and this must be taken into account.
Analysis before data type conversion
Data type changes may have some side effects, which we discuss in this post. Before converting data types, you need to check with your service or application team if they have the following:
- Any dependency of these data types (INT and BIGINT) in application SQLs that require type casting
- Any change required in stored procedures or triggers for these data types
- Any dependency in Hibernate-generated code
- If these data type changes have any impact on future data growth
- Any plans in the future for these columns to store any fractional values
Without proper analysis and information about the application code and the database code objects, it's not recommended to change the data types. If you do so, your application queries may start showing performance issues because they require internal type casting from NUMERIC to INT or BIGINT. If you have code objects like procedures or triggers that depend on these data types, you need to change those objects to avoid performance issues. Future data growth may affect these conversions as well: if you convert a column to INT based on the current data, values may exceed the INT range in the future. Make sure you don't have any plans to store fractional values after the migration is complete; if that is a requirement, you need to choose a data type other than INT or BIGINT.
For this post, we examine two different methods to analyze the information to recommend INT, BIGINT, DOUBLE PRECISION, or NUMERIC:
- Metadata-based data type conversion
- Actual data-based data type conversion
The Dual Table
Oracle uses the table DUAL for selects where no table would actually be necessary, since the FROM clause is mandatory in Oracle. In PostgreSQL we can omit the FROM clause altogether. This table can be created in Postgres as a view to ease porting problems, which allows code to remain somewhat compatible with Oracle SQL without annoying the Postgres parser.
In any case when migrating, if possible, just remove the "FROM DUAL" clause from the statement. Joins with dual are really rare - and peculiar.
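A minimal sketch of such a compatibility view:

```sql
-- One-row stand-in for Oracle's DUAL
CREATE VIEW dual AS SELECT 'X'::varchar AS dummy;

-- Oracle-style statements then parse unchanged:
SELECT 1 + 1 FROM dual;
```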
Temporary tables
In PostgreSQL, unlike Oracle, temporary tables are created each time and live only for the duration of a session. In Oracle only the contents of a temporary table are session-scoped; the tables themselves exist permanently.
This leads to the following problems:
1) A temporary table has to be created for every new session. (This can be simplified by using the application server's ability to run an initialization SQL block when a new session is created (connection pool → advanced → Init SQL).)
2) Hibernate validates entities within the configured schema (if no schema is specified, validation runs across all accessible schemas). Since Postgres creates temporary tables in separate schemas, validation will fail.
(This can be worked around by creating a real table with the same structure. Native queries will then operate on the temporary table rather than the regular one.)
3) Such a temporary table can only be used through native queries, because Hibernate adds the schema name to every query it generates (so either do all temporary-table work through native queries, or abandon temporary tables and use a regular table with data separated by a unique key; both options mean rewriting functionality).
To keep working with temporary tables in Postgres the way we did in Oracle, we use table functions and a view built on top of a table function. Details here
For those of you out there working with PostgreSQL in a professional way, migrating from Oracle to PostgreSQL might be one of the most beloved tasks available. One of the first things most people will notice, however, is that those datatypes available in Oracle might not be quite the same in PostgreSQL. This blog will try to shed some light and show, how things work.
Memory Contention
With multiple separate, heavy-hitting DML applications hitting the same PostgreSQL instance, we start to see inherent problems when unrelated SQL workloads (separate databases, schemas, and tables) compete for the same memory resources. With a single application, one advantage is that most of the disk-to-memory activity involves the same heavily used tables, so you usually get your 95% to 99% cache hit ratio. But when multiple separate SQL workloads run within a single PostgreSQL instance, each with its own set of tables, you may begin to see contention for memory-resident pages between them. In that case you have to make sure you have enough OS memory and shared_buffers memory to handle the surge when multiple SQL workloads compete for the same paging resources. Anticipating different degrees of load activity at any point in time between the competing SQL workloads makes tuning shared_buffers much harder, and perhaps impossible for both at the same time, without adding significantly more reserved memory to shared_buffers even though it may not be needed most of the time.
While the first statement after a COMMIT starts a new multi-statement transaction in Oracle RDBMS, Postgres operates in autocommit mode. Every piece of code doing some DML that is not to be committed immediately must start a transaction with a BEGIN statement. ROLLBACK and COMMIT have the same semantic meaning in both systems; also SAVEPOINTS mean the same. Postgres knows all the isolation levels Oracle knows (and a few more). In most cases the default isolation level of Postgres (Read Committed) will be sufficient.
There are a handful of grammar differences in Postgres for functionality that is actually the same. ACS/pg attempts to perform these changes automatically, leaving only the major functionality differences to be ported by hand. This is done by db_sql_prep which performs a number of regular expression substitutions on a piece of SQL.
Conversion to PostgreSQL INT or BIGINT
You can convert Oracle tables with columns of the data type NUMBER or INTEGER to PostgreSQL INT or BIGINT if they meet the following criteria:
- The data type is NUMERIC or INTEGER in Oracle
- DATA_PRECISION is NOT NULL (not a variable length NUMERIC)
- DATA_SCALE is 0 (integers and not float values)
- MAX_LENGTH is defined as
- If DATA_PRECISION is < 10, use INT. If DATA_PRECISION is 10–18, use BIGINT.
You can use the following query on Oracle to find candidate columns of NUMERIC or INTEGER data type that can be converted to BIGINT or INT in PostgreSQL:
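A query with this intent (HR is an example owner) might look like:

```sql
SELECT owner, table_name, column_name, data_precision, data_scale
FROM all_tab_columns
WHERE owner = 'HR'
  AND data_type = 'NUMBER'
  AND data_precision IS NOT NULL
  AND data_scale = 0
ORDER BY table_name, column_name;
```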
The following code is an example output of the preceding query, which shows precision and scale of NUMBER data types for a particular user:
The output shows four columns (REGION_ID of the COUNTRIES table, CUSTOMER_ID of the CUSTOMER table, and ORDER_ID and VALUE of the ORDERS table) that have precision between 10–18, which can be converted to BIGINT in PostgreSQL. The remaining columns can be converted to INT.
Sequences
Oracle's sequence grammar is sequence_name.nextval.
Postgres's sequence grammar is nextval('sequence_name').
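Side by side (my_seq is an example name):

```sql
-- Oracle
SELECT my_seq.nextval FROM dual;

-- PostgreSQL
SELECT nextval('my_seq');
```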
In Tcl, getting the next sequence value can be abstracted by calling [db_sequence_nextval $db sequence_name]. In case you need to include a sequence's value in a more complex SQL statement, you can use [db_sequence_nextval_sql sequence_name] which will return the appropriate grammar.
Date and Time
Oracle mostly uses the data type DATE, which is a combination of date and time. In most cases it should be sufficient to replace it with Postgres' TIMESTAMP. In some cases the result will not be exactly correct, because DATE consists only of second, minute, hour, day, month and year: no fractions of a second, no DST, no time zones.
Oracle's TIMESTAMP is very similar to Postgres'.
Oracle's TIMESTAMP WITH TIME ZONE is different from PostgreSQL's in that it stores the time zone along with the value. PostgreSQL stores the values in UTC and converts them to the current session time zone upon display. If you need to retain the original time zone information, you have to store it separately.
Oracle only knows INTERVAL YEAR TO MONTH and INTERVAL DAY TO SECOND, both of which can be used directly in Postgres too.
CONNECT BY
Postgres doesn't have CONNECT BY statements. It does, however, have WITH RECURSIVE. As WITH RECURSIVE is Turing-complete, it is simple to translate CONNECT BY statements into WITH RECURSIVE ones.
Sometimes CONNECT BY is used as a simple iterator; such queries have straightforward Postgres equivalents.
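For instance, generating the numbers 1 through 10:

```sql
-- Oracle
SELECT level FROM dual CONNECT BY level <= 10;

-- PostgreSQL
WITH RECURSIVE t(n) AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM t WHERE n < 10
)
SELECT n FROM t;
```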
A mapping table like the one above is very useful during database migration, since it differentiates Oracle data types from their PostgreSQL counterparts.
Hans-Jürgen Schönig
Hans-Jürgen Schönig has experience with PostgreSQL since the 90s. He is CEO and technical lead of CYBERTEC, which is one of the market leaders in this field and has served countless customers around the globe since the year 2000.
Sysdate
Oracle uses the function sysdate to get the current date and time in the server's time zone. Postgres uses 'now'::timestamp for the current transaction start date and time in the session's time zone, which ACS/pg has conveniently wrapped in a function named sysdate().
ACS/pg also includes a Tcl procedure named db_sysdate which should be used every time the sysdate term appears. Thus:
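The wrapper amounts to something like the following sketch (ACS/pg's actual definition may differ):

```sql
-- Oracle-compatible sysdate() for PostgreSQL
CREATE FUNCTION sysdate() RETURNS timestamp
  AS 'SELECT now()::timestamp' LANGUAGE sql;

-- Oracle code such as "SELECT sysdate FROM dual" becomes:
SELECT sysdate();
```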
Functions
Oracle has more than 250 built-in single row functions and more than 50 aggregate functions. See Oracle Functions.
An Oracle to PostgreSQL migration in the AWS Cloud can be a multistage process with different technologies and skills involved, starting from the assessment stage to the cutover stage. For more information about the migration process, see Database Migration—What Do You Need to Know Before You Start? and the following posts on best practices, including the migration process and infrastructure considerations, source database considerations, and target database considerations for the PostgreSQL environment.
In the migration process, data type conversion from Oracle to PostgreSQL is one of the key stages. Multiple tools are available on the market, such as AWS Schema Conversion Tool (AWS SCT) and Ora2PG, which help you with data type mapping and conversion. However, it’s always recommended that you do an upfront analysis on the source data to determine the target data type, especially when working with data types like NUMBER.
Many tools often convert this data type to NUMERIC in PostgreSQL. Although it looks easy to convert Oracle NUMBER to PostgreSQL NUMERIC, it’s not ideal for performance because calculations on NUMERIC are very slow when compared to an integer type. Unless you have a large value without scale that can’t be stored in BIGINT, you don’t need to choose the column data type NUMERIC. And as long as the column data doesn’t have float values, you don’t need to define the columns as the DOUBLE PRECISION data type. These columns can be INT or BIGINT based on the values stored. Therefore, it’s worth the effort to do an upfront analysis to determine if you should convert the Oracle NUMBER data type to INT, BIGINT, DOUBLE PRECISION, or NUMERIC when migrating to PostgreSQL.
This series is divided into two posts. In this post, we cover two analysis methods to define the target data type column in PostgreSQL depending on how the NUMBER data type is defined in the Oracle database and what values are stored in the columns’ data. In the second post, we cover how to change the data types in the target PostgreSQL database after analysis using the AWS SCT features and map data type using transformation.
Before we suggest which of these PostgreSQL data types is a suitable match for Oracle NUMBER, it's important to know the key differences between INT, BIGINT, DOUBLE PRECISION, and NUMERIC.
BLOBs
(Note that this paragraph is quite outdated: PostgreSQL now does not only have Large Objects as described below, but the type bytea that is a good replacement for BLOBs, except that it cannot be streamed - for that you still use Large Objects.)
Binary large object support in Postgres is very poor and unsuitable for use in a 24/7 environment, because you can't dump them with pg_dump. Backing up a database that makes use of Postgres large objects requires one to knock down the RDBMS and dump the files in the database directory.
Don Baccus put together a hack that extends AOLserver's postgres driver with BLOB-like support, by uuencoding/decoding binary files before stuffing them into or extracting them from the database. The resulting objects can be consistently dumped by "pg_dump" while the RDBMS is up and running. There is no need to interrupt service while making your backup.
To get around the one-block limit on the size of a tuple imposed by Postgres, the driver segments the encoded data into 8K chunks.
Postgres large objects are scheduled for a major overhaul in summer 2000. Because of this, only the BLOB functionality used by the ACS was implemented.
To use the BLOB driver extension, you must first create a column of type "integer" with the name "lob" in the table that will store the BLOB, and a trigger on it that calls "on_lob_ref". You must use the name "lob". Here's an example:
```sql
create trigger my_table_lob_trig
  before insert or delete or update on my_table
  for each row execute procedure on_lob_ref();
```
To put a binary file into "my_table":
Note that the call to ns_pg to stuff the file into the database MUST be wrapped in a transaction, even if you're not updating any other tables at the same time. The driver will return an error if you don't.
To return a large object stored in "my_table" to the user:
Note that you don't need to wrap the call to blob_write in a transaction, as the database isn't being modified.
The large objects are automatically deleted when no longer used. To replace the large object stored in an existing record, just allocate a new one by calling "empty_lob()" and assign the returned key to the "lob" column in your table.
A couple of extension tools are available for Oracle migration.
- Ispirer MnMTK – Ispirer Migration and Modernization Toolkit automatically migrates the entire database schema (tables, views, packages, stored procedures, functions, triggers, etc.) and transfers data from Oracle to PostgreSQL.
- Full Convert – Database conversion between Oracle (and 40+ other database engines) and PostgreSQL. Uses up to 16 parallel workers for throughput of over a million records per second.
- Oracle to Postgres data migration and sync – The software can convert one million records in 4–5 minutes. A trigger-based sync method and simultaneous bi-directional synchronization help you manage your data easily and efficiently.
- ESF Database Migration Toolkit – A toolkit that migrates an Oracle database to PostgreSQL in a wizard. It connects to the Oracle and PostgreSQL databases directly and migrates table structures, data, indexes, primary keys, foreign keys, comments, and so on.
- orafce – The goal of the project is to implement some functions from the Oracle database. Some date functions (next_day, last_day, trunc, round, ...), string functions, and some modules (DBMS_ALERT, DBMS_OUTPUT, UTL_FILE, DBMS_PIPE, ...) are implemented now. Functionality was verified on Oracle 10g, and the module is useful for production work.
- ora2pg – Ora2Pg is a Perl module to export an Oracle database schema to a PostgreSQL-compatible schema. It connects to your Oracle database, extracts its structure, and generates an SQL script that you can load into your PostgreSQL database.
- Oracle to Postgres – A program to migrate Oracle databases to PostgreSQL server. It does not use ODBC or any other middleware software. Table structures, data, indexes, primary keys and foreign keys are converted. Command-line support allows you to script, automate and schedule the conversion process.
- ora_migrator – A PostgreSQL extension in PL/pgSQL providing a database migration toolkit that makes use of the Oracle Foreign Data Wrapper.
During the migration from Oracle to PostgreSQL using ora2pg, the question of data type mismatches between the two databases came up. By default not every column is converted correctly, and the absence of Boolean in Oracle creates outright ambiguity: some columns need to be migrated as numbers, others as booleans. At the same time, Hibernate knows everything about the data types and can generate a reference schema.
The final migration process looked like this: create the table structure with ora2pg, fix the structure against the reference schema, transfer the data, convert blob and Boolean columns, add the functions missing from PostgreSQL (nvl, nvl2, regexp_substr), and create the remaining structure: indexes, views, and so on.
Below are the SQL scripts accumulated during the conversion for semi-automatic migration.
Handling oid (bytea)
Create a procedure to convert bytea to oid
Create a temporary column
Drop the old column
Rename the temporary column
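The scripts for these steps are not shown above; a minimal sketch of what they might look like follows. The table and column names (docs, payload) are hypothetical; lo_from_bytea() is the standard PostgreSQL large-object function that creates a large object from a bytea value and returns its oid.

```sql
-- 1. Helper: convert a bytea value into a large object, return its oid
CREATE OR REPLACE FUNCTION bytea_to_oid(data bytea) RETURNS oid AS $$
BEGIN
    IF data IS NULL THEN
        RETURN NULL;
    END IF;
    RETURN lo_from_bytea(0, data);  -- 0 lets the server pick a free oid
END;
$$ LANGUAGE plpgsql;

-- 2. Create a temporary column of type oid
ALTER TABLE docs ADD COLUMN payload_tmp oid;

-- 3. Fill it from the bytea column
UPDATE docs SET payload_tmp = bytea_to_oid(payload);

-- 4. Drop the old column
ALTER TABLE docs DROP COLUMN payload;

-- 5. Rename the temporary column
ALTER TABLE docs RENAME COLUMN payload_tmp TO payload;
```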
Empty strings and NULL values
In Oracle, an empty string ('') and NULL are treated as the same thing in a string context, up to and including the latest versions. You can concatenate a NULL with a string and get that string back as the result; in PostgreSQL you get NULL in that case. In Oracle, you use the IS NULL operator to check whether a string is empty; in PostgreSQL, IS NULL yields FALSE for an empty string and TRUE only for NULL.
When converting from Oracle to PostgreSQL, you will need to carefully analyze all of your string-handling code and separate NULLs from empty strings.
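A small illustration of the difference, run in PostgreSQL:

```sql
SELECT 'abc' || NULL;   -- NULL in PostgreSQL; in Oracle this yields 'abc'
SELECT '' IS NULL;      -- false in PostgreSQL; in Oracle '' IS NULL is true
SELECT NULL IS NULL;    -- true in both
```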
Oracle NUMBER and PostgreSQL NUMERIC
In Oracle, the NUMBER data type is defined as NUMBER(precision, scale) and in PostgreSQL, NUMERIC is defined as NUMERIC(precision, scale), with precision and scale defined as follows:
- Precision – Total count of significant digits in the whole number, that is, the number of digits to both sides of the decimal point
- Scale – Count of decimal digits in the fractional part, to the right of the decimal point
Analysis method: Actual data-based data type conversion
In Oracle, the NUMBER data type is often defined with no scale, and such columns are used for storing integer values only. But because the scale isn't defined, the metadata doesn't show whether these columns store only integer values. In that case, we need to perform a full table scan of those columns to identify whether they can be converted to BIGINT or INT in PostgreSQL.
A NUMBER column may be defined with DATA_PRECISION higher than 18 even though all of its values fit within the BIGINT or INT range of PostgreSQL.
You can use the following SQL code in an Oracle database to find if the actual DATA_PRECISION and DATA_SCALE is in use. This code performs a full table scan, so you need to do it in batches for some tables, and if possible run it during low peak hours in an active standby database (ADG):
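The query itself is not shown above; a sketch of the kind of per-column full-scan query meant here follows (table_name and column_name are placeholders for your own objects). It reports the largest precision and scale actually present in the data:

```sql
-- Oracle SQL: find the actual precision/scale in use for one column.
-- Run in batches and, if possible, on an active standby (ADG).
SELECT
    MAX(LENGTH(REPLACE(TO_CHAR(ABS(column_name)), '.', ''))) AS max_data_precision,
    MAX(CASE
            WHEN column_name = TRUNC(column_name) THEN 0
            ELSE LENGTH(TO_CHAR(ABS(column_name) - TRUNC(ABS(column_name)))) - 1
        END) AS max_data_scale
FROM table_name;
```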
Based on the result of the preceding code, use the following transform rules:
- If MAX_DATA_PRECISION < 10 and MAX_DATA_SCALE = 0, convert to INT
- If DATA_PRECISION is 10–18 and MAX_DATA_SCALE = 0, convert to BIGINT
- If MAX_DATA_SCALE > 0 and MAX_DATA_PRECISION + MAX_DATA_SCALE <= 15, convert to DOUBLE PRECISION
- If MAX_DATA_SCALE > 0 and MAX_DATA_PRECISION + MAX_DATA_SCALE > 15, convert to NUMERIC
Let’s look at the following example. In Oracle, create the table number_col_test with four columns defined and a couple of rows inserted:
We next run a query against the data to find the actual precision and scale. In the following output, can_be_int has an actual precision of 6 and scale of 0 even though it's defined as NUMBER(8,4), so we can define this column as INT in PostgreSQL.
In the following output, can_be_bigint has an actual precision of 16 and scale of 0 even though it's defined as NUMBER(19,2), so we can define this column as BIGINT in PostgreSQL.
In the following output, can_be_doubleprecision has an actual precision of 8 and scale of 2 even though it's defined as NUMBER(15,2), so we can define this column as DOUBLE PRECISION in PostgreSQL.
In the following output, should_be_numeric has an actual precision of 10 and scale of 9 even though it's defined as NUMBER(20,10), so we can define this column as NUMERIC in PostgreSQL.
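The example DDL is not shown above; a sketch follows. The column definitions are taken from the text, while the inserted values are hypothetical placeholders that merely fit the declared types:

```sql
-- Oracle: test table for the precision/scale analysis described above
CREATE TABLE number_col_test (
    can_be_int             NUMBER(8,4),
    can_be_bigint          NUMBER(19,2),
    can_be_doubleprecision NUMBER(15,2),
    should_be_numeric      NUMBER(20,10)
);

INSERT INTO number_col_test
VALUES (1234, 1234567890123456, 123456.78, 1234567890.123456789);
COMMIT;
```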
Fixing column types
See which columns need changes
Replacements for the simple cases
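A sketch of how this comparison and the simple replacements might look. It assumes the Hibernate-generated reference schema has been loaded into a schema named reference alongside the ora2pg output in public; the schema names, table, and column names are assumptions:

```sql
-- Compare actual column types against the reference schema
SELECT c.table_name, c.column_name,
       c.data_type AS actual_type,
       r.data_type AS reference_type
FROM information_schema.columns c
JOIN information_schema.columns r
  ON r.table_name = c.table_name AND r.column_name = c.column_name
WHERE c.table_schema = 'public'
  AND r.table_schema = 'reference'
  AND c.data_type <> r.data_type
ORDER BY c.table_name, c.column_name;

-- A simple in-place replacement for a column that should be a plain integer
ALTER TABLE some_table
    ALTER COLUMN some_col TYPE integer USING some_col::integer;
```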
NO_DATA_FOUND and TOO_MANY_ROWS
These exceptions are not raised by default for SELECT ... INTO in PL/pgSQL. You need to add the STRICT keyword after INTO in every SELECT where you want to keep the single-row check from your Oracle stored code. See the documentation.
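A sketch of the behavior; the table and column names (employees, emp_name, emp_id) are hypothetical:

```sql
DO $$
DECLARE
    v_name text;
BEGIN
    -- With STRICT, PL/pgSQL raises an error if the query returns
    -- no rows or more than one row, matching Oracle PL/SQL semantics.
    SELECT emp_name INTO STRICT v_name
    FROM employees
    WHERE emp_id = 42;
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        RAISE NOTICE 'no employee 42';
    WHEN TOO_MANY_ROWS THEN
        RAISE NOTICE 'emp_id 42 is not unique';
END;
$$;
```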
Handling boolean
For these columns you need to drop the default value (if there is one), change the type, and then restore the default.
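A sketch for one column; the table and column names are hypothetical, and it assumes Oracle stored the flag as NUMBER(1), which ora2pg converted to a numeric type:

```sql
ALTER TABLE users ALTER COLUMN is_active DROP DEFAULT;
ALTER TABLE users ALTER COLUMN is_active TYPE boolean
    USING is_active::int::boolean;
ALTER TABLE users ALTER COLUMN is_active SET DEFAULT true;
```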
Subquery in FROM
PostgreSQL requires a sub-SELECT surrounded by parentheses, and an alias must be provided for it. The alias is not mandatory for Oracle.
For example, a sub-SELECT that Oracle accepts without an alias must be given one in PostgreSQL.
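The two snippets might look like this (emp and deptno are hypothetical names):

```sql
-- Valid in Oracle: no alias needed for the sub-SELECT
-- SELECT * FROM (SELECT * FROM emp WHERE deptno = 10);

-- PostgreSQL requires parentheses AND an alias (here "t"):
SELECT * FROM (SELECT * FROM emp WHERE deptno = 10) t;
```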
Postgres doesn't have all the functionality of Oracle. ACS/pg is forced to deal with these limitations with specific work-arounds. Almost everything can be done under Postgres, but some features are awaiting new versions of the open-source database.
Comparing numeric types in PostgreSQL
One of the important differences between INT and BIGINT vs. NUMERIC is storage format. INT and BIGINT have a fixed length: INT has 4 bytes, and BIGINT has 8 bytes. However, the NUMERIC type is of variable length and stores 0–255 bytes as needed. So NUMERIC needs more space to store the data. Let’s look at the following example code (we’re using PostgreSQL 12), which shows the size differences of tables with integer columns and NUMERIC:
From the preceding output, INT occupies 3,515 bytes and NUMERIC occupies 6,152 bytes for the same set of values. However, you see the same table size because PostgreSQL is designed such that its own internal natural alignment is 8 bytes, meaning consecutive fixed-length columns of differing size must be padded with empty bytes in some cases. We can see that with the following code:
From the preceding output, it’s clear that an empty PostgreSQL row requires 24 bytes of various header elements, a SMALLINT is 2 bytes, and a BIGINT is 8 bytes. However, combining a SMALLINT and BIGINT takes 16 bytes. This is because PostgreSQL is padding the smaller column to match the size of the following column for alignment purposes. Instead of 2 + 8 = 10, the size becomes 8 + 8 = 16.
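The padding effect described above can be observed directly with pg_column_size() over row constructors:

```sql
-- PostgreSQL: per-value sizes including the 24-byte row header
SELECT pg_column_size(ROW())                       AS empty_row,  -- 24
       pg_column_size(ROW(1::smallint))            AS sm,         -- 24 + 2
       pg_column_size(ROW(1::bigint))              AS big,        -- 24 + 8
       pg_column_size(ROW(1::smallint, 1::bigint)) AS sm_big;     -- 24 + 16 (padded)
```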
Integer and floating-point values are represented differently, and their arithmetic operations are implemented differently. In general, floating-point arithmetic is more complex than integer arithmetic: calculations on NUMERIC or floating-point values take more time than the same calculations on integer types.
PostgreSQL BIGINT and INT data types store their own range of values. The BIGINT range (8 bytes) is -9223372036854775808 to 9223372036854775807 :
The INT range (4 bytes) is -2147483648 to 2147483647 :
For information about these data types, see Numeric Types.
With the NUMBER data type having precision and scale in Oracle, it can be either DOUBLE PRECISION or NUMERIC in PostgreSQL. One of the main differences between DOUBLE PRECISION and NUMERIC is storage format and total length of precision or scale supported. DOUBLE PRECISION is fixed at 8 bytes of storage and 15-decimal digit precision.
DOUBLE PRECISION offers a performance benefit over the NUMERIC data type similar to the one explained for BIGINT and INT in PostgreSQL.
With DOUBLE PRECISION, because its limit is 15 decimals, for any data with higher precision or scale we might be affected by data truncation during data migration from Oracle. See the following code:
When considering data type in PostgreSQL when the NUMBER data type has decimal information, we should check for max precision along with max scale and decide accordingly if the target data type is either DOUBLE PRECISION or NUMERIC.
If you're planning to store values that require a certain precision or arithmetic accuracy, the DOUBLE PRECISION data type may not be the right choice for your needs. For example, if you store the result of 2/3, the value is rounded when the 15th digit is reached. The limit matters not only for rounding the stored value but also for arithmetic accuracy: in the following example, DOUBLE PRECISION gives the wrong answer.
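A sketch of the accuracy difference, run in PostgreSQL (note the explicit casts; a bare 2/3 would be integer division):

```sql
SELECT 2::double precision / 3;  -- binary rounding after ~15-17 digits
SELECT 2::numeric / 3;           -- exact decimal arithmetic to the result scale
```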
Starting with PostgreSQL 12, performance has been improved by using a new algorithm for output of real and double precision values. In previous versions (older than PostgreSQL 12), displayed floating point values were rounded to 6 (for real) or 15 (for double precision) digits by default, adjusted by the value of the parameter extra_float_digits . Now, whenever extra_float_digits is more than zero (as it is by default from PostgreSQL 12 and newer), only the minimum number of digits required to preserve the exact binary value are output. The behavior is the same as before when extra_float_digits is set to zero or less.
Decode
Oracle's handy decode function works as follows:
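The DECODE snippet referred to here is missing; a sketch of its general form and a hypothetical example (status and accounts are invented names):

```sql
-- General form (Oracle expression, not a complete statement):
--   DECODE(expr, search1, result1,
--                search2, result2,
--                ...,
--                default)

SELECT DECODE(status, 'A', 'active',
                      'I', 'inactive',
                           'unknown')
FROM accounts;
```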
To evaluate this expression, Oracle compares expr to each search value one by one. If expr is equal to a search, Oracle returns the corresponding result. If no match is found, Oracle returns default, or, if default is omitted, returns null.
Postgres doesn't have the same construct. It can be replicated with:
which returns the expression corresponding to the first true predicate. For example:
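The equivalent CASE expression in PostgreSQL might look like this (status and accounts are hypothetical names):

```sql
SELECT CASE
         WHEN status = 'A' THEN 'active'
         WHEN status = 'I' THEN 'inactive'
         ELSE 'unknown'
       END
FROM accounts;
```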
There is one tiny discrepancy between DECODE and its emulation with CASE: DECODE treats two NULLs as equal, so DECODE(x, NULL, 'null', 'else') returns 'null' when x is NULL, while the corresponding CASE WHEN x = NULL THEN 'null' ELSE 'else' END results in 'else', because x = NULL is never true. This CASE behavior is the same within Oracle.
Oracle has another handy function: NVL. NVL returns its first argument if it is not null; otherwise, it returns its second argument.
The above statement will return SYSDATE if hire_date is null. Postgres (and Oracle) has a function that does the same thing in a more general way: coalesce(expr1, expr2, expr3, ...) returns the first non-null expression passed to it.
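The statement referred to above is missing; it might look like this (employees and hire_date are hypothetical names):

```sql
-- Oracle:
--   SELECT NVL(hire_date, SYSDATE) FROM employees;

-- Portable equivalent, valid in both Oracle and PostgreSQL:
SELECT COALESCE(hire_date, CURRENT_DATE) FROM employees;
```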
Data Types
Postgres sticks mostly to the SQL standards, while Oracle has historically gone its own way, especially concerning data types.
Transferring the data
Before transferring the data, check that the remaining type mismatches are only in the boolean and bytea (oid) columns, and that all numeric columns have the correct precision and are not 'double precision', to avoid accidental rounding.
By default, the copy process runs in a single thread. Change this in the configuration to the desired number of parallel workers.
Start the data copy process.
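The configuration change and copy command might look like this. JOBS, ORACLE_COPIES, and PARALLEL_TABLES are ora2pg parallelism directives; the values shown are examples, not recommendations:

```
# ora2pg.conf: example parallelism settings
JOBS            4    # parallel processes used to write data to PostgreSQL
ORACLE_COPIES   4    # parallel connections used to extract data from Oracle
PARALLEL_TABLES 4    # number of tables exported at the same time

# Then start the data copy:
#   ora2pg -t COPY -c ora2pg.conf
```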