Oracle replication очистить очередь

Любопытное наблюдение: когда говоришь о репликации Oracle, тебе сразу говорят «да знаем мы про этот GoldenGate!». Конечно, знаете, ведь это решение предлагает непосредственный вендор объекта репликации. При этом за кадром остаются не менее интересные решения. Вендор Quest нам говорит, что в России о Shareplex больше известно в компаниях с иностранным участием, чем в исконно русских. Всё потому, что коллеги из за рубежа рассказали своим российским коллегам о том, что есть такой Shareplex и он очень даже хорош для репликации, а где-то даже превосходит решение от Oracle.

Начиная с Oracle 19c, RAC можно использовать только в версии Enterprise Edition (EE). Если вы решили обновиться до 19c, а у вас одна из предыдущих версий СУБД Oracle Standard Edition работает в режиме RAC, остается несколько вариантов:

Переход с SE на EE;
Переезд в облако;
Конвертация Oracle RAC в Standalone;
Использование SE с HA;
Репликация.

Другой кейс использования репликации — это т.н. ZeroImpact интеграция Oracle с внешними системами посредством чтения Redo/Archive логов. Например, в одном из банков был кейс репликации Oracle в Kafka для того, чтобы оперативно предлагать банковские продукты клиентам, оставившим свои данные на сайте.

Shareplex также поддерживает репликацию в PostgreSQL, что в итоге приводит к снижению стоимости владения СУБД. Кстати, это один из популярных кейсов использования инструмента для репликации.

В этой статье я познакомлю вас с возможностями Shareplex, расскажу о его преимуществах и предложу пилотный проект. Погнали!

Shareplex поддерживает репликацию из Oracle (включая ASM, RAC, Exadata) в другой Oracle, в Kafka, PostgreSQL, SQL Server и в JSON-файлы. Для старта репликации нужна установка агента на источник и приемник (в некоторых случаях, например, с Kafka, агент ставится на отдельный сервер). Shareplex позволяет передавать данные как из одного источника в несколько приемников так и из нескольких источников в один приемник. Shareplex не использует для своей работы расширенный функционал СУБД Oracle: Logminer или тот же Streams, поэтому не требует дополнительных расходов на лицензирование.

В стандартную поставку Shareplex также входит инструмент для сравнения источника с приемником. В случае обнаружения различий, будет сгенерирован SQL-запрос, выполнив который, источник и приемник придут в соответствие друг другу.

Теперь перейдем к сравнению Shareplex и GoldenGate. Ключевые отличия первого от второго — это цена и контроль консистентности передаваемых данных. Shareplex значительно дешевле аналога от Oracle и имеет встроенную возможность контроля целостности данных.

Дополнительные вопросы относительно Shareplex вы можете задать через форму обратной связи на нашем сайте или другим удобным способом. Решение доступно в триальной версии, вы можете его попробовать, оценить возможности или сравнить с вашим текущим средством для репликации Oracle. Пилотные проекты и референсные встречи мы также проводим, пожалуйста, обращайтесь.

This chapter illustrates how to manage the replication objects and queues in your replication environment using the replication management API.

This chapter contains these topics:

Managing the Deferred Transactions Queue

Typically, Advanced Replication is configured to push and purge the deferred transaction queue automatically. At times, however, you might need to push or purge the deferred transaction queue manually. The process for pushing the deferred transaction queue is the same at master sites and materialized view sites.

Purging the Deferred Transaction Queue

If your system is not set to automatically purge the successfully propagated transactions in your deferred transaction queue periodically, then you must complete the following steps to purge them manually.

This example illustrates purging the deferred transaction queue at a materialized view site, but the process is the same at master sites and materialized view sites.

Executed As : Materialized View Administrator

Executed At : Materialized View Site

Complete the following steps:

Step 1 Connect to the materialized view site as the materialized view administrator.

Step 2 Purge the deferred transaction queue.

If you use the purge_method_quick parameter, deferred transactions and deferred procedure calls that have been successfully pushed can remain in the DEFTRAN and DEFCALL data dictionary views for longer than expected before they are purged. See the "Usage Notes" for DBMS_DEFER_SYS.PURGE for details.

Modifying Tables without Replicating the Modifications

You might have a situation in which you need to modify a replicated object, but you do not want this modification replicated to the other sites in the replication environment. For example, you might want to disable replication in the following situations:

When you are using procedural replication to propagate a change, always disable row-level replication at the start of your procedure.

You might need to disable replication in triggers defined on replicated tables to avoid replicating trigger actions multiple times. See "Ensuring that Replicated Triggers Fire Only Once".

Sometimes when you manually resolve a conflict, you might not want to replicate this modification to the other copies of the table.

You might need to do this, for example, if you need to correct the state of a record at one site so that a conflicting replicated update will succeed when you reexecute the error transaction. Or, you might use an unreplicated modification to undo the effects of a transaction at its origin site because the transaction could not be applied at the destination site. In this example, you can use the Replication Management tool to delete the conflicting transaction from the destination site.

To modify tables without replicating the modifications, use the REPLICATION_ON and REPLICATION_OFF procedures in the DBMS_REPUTIL package. These procedures take no arguments and are used as flags by the generated replication triggers.

To enable and disable replication, you must have the EXECUTE privilege on the DBMS_REPUTIL package.

Pushing the Deferred Transaction Queue

Master sites are configured to push the deferred transaction queue automatically at set intervals. At materialized view sites, if you do not automatically propagate the transactions in your deferred transaction queue during the refresh of your materialized view, then you must complete the following steps to propagate changes made to the updatable materialized view to its master table or master materialized view.

This example illustrates pushing the deferred transaction queue at a materialized view site, but the process is the same at master sites and materialized view sites.

Executed As : Materialized View Administrator

Executed At : Materialized View Site

Complete the following steps:

Step 1 Connect to the materialized view site as the materialized view administrator.

Step 2 Execute the following SELECT statement to view the deferred transactions and their destinations.

Propagation of the deferred transaction queue is based on the destination of the transaction. Each distinct destination and the number of transactions pending for the destination will be displayed.

Step 3 Execute the DBMS_DEFER_SYS.PUSH function for each site that is listed as a destination for a deferred transaction.

Run the PUSH procedure for each destination that was returned in the SELECT statement you ran in Step 2.

Using the RECTIFY Procedure

The RECTIFY procedure uses the information generated by the DIFFERENCES procedure to rectify the two tables. Any rows found in the first table and not in the second are inserted into the second table. Any rows found in the second table and not in the first are deleted from the second table.

To restore equivalency between all copies of a replicated table, complete the following steps:

Step 1 Select one copy of the table to be the "reference" table.

This copy will be used to update all other replicas of the table as needed.

Step 2 Determine if it is necessary to check all rows and columns in the table for differences, or only a subset.

For example, it might not be necessary to check rows that have not been updated since the last time that you checked for differences. Although it is not necessary to check all columns, your column list must include all columns that make up the primary key (or that you designated as a substitute identity key) for the table.

Step 3 After determining which columns you will be checking in the table, create two tables to hold the results of the comparison.

You must create one table that can hold the data for the columns being compared. For example, if you decide to compare the employee_id , salary , and department_id columns of the employees table, then your CREATE statement would need to be similar to the following:

You must also create a table that indicates where the row is found. This table must contain three columns with the datatypes shown in the following example:

Step 4 Suspend replication activity for the replication group containing the tables that you want to compare.

Although suspending replication activity for the group is not a requirement, rectifying tables that were not quiesced first can result in inconsistencies in your data.

Step 5 At the site containing the "reference" table, call the DIFFERENCES procedure.

For example, if you wanted to compare the employees tables at the New York and San Francisco sites, then your procedure call would look similar to the following:

Figure 9-1 shows an example of two replicas of the employees table and what the resulting missing rows tables would look like if you executed the DIFFERENCES procedure on these replicas.

Figure 9-1 Determining Differences Between Replicas

Notice that the two missing rows tables are related by the ROWID and r_id columns.

Step 6 Rectify the table at the "comparison" site to be equivalent to the table at the "reference" site.

The RECTIFY procedure temporarily disables replication at the "comparison" site while it performs the necessary insertions and deletions, as you would not want to propagate these changes. RECTIFY first performs all of the necessary DELETE operations and then performs all of the INSERT operations. This ensures that there are no violations of a PRIMARY KEY constraint.

After you have successfully executed the RECTIFY procedure, your missing rows tables should be empty.

If you have any additional constraints on the "comparison" table, then you must ensure that they are not violated when you call RECTIFY . You might need to update the table directly using the information in the missing rows table. If so, then be sure to DELETE the appropriate rows from the missing rows tables.

Step 7 Repeat Steps 5 and 6 for the remaining copies of the replicated table.

Remember to use the same "reference" table each time to ensure that all copies are identical when you complete this procedure.

Step 8 Resume replication activity for the master group.

Determining Differences Between Replicated Tables

It is possible for the differences to arise in replicated tables. When administering a replication environment, you might want to check, periodically, whether the contents of two replicated tables are identical. The following procedures in the DBMS_RECTIFIER_DIFF package let you identify, and optionally rectify, the differences between two tables.

Converting a LONG Column to a LOB Column in a Replicated Table

LOB columns can be replicated, but LONG columns cannot be replicated. You can convert the datatype of a LONG column to a CLOB column and the datatype of a LONG_RAW column to a BLOB column.

Converting a LONG column to a LOB column can result in increased network bandwidth requirements because the data in such a column is replicated after conversion. Make sure you have adequate network bandwidth before completing the procedure in this section.

Oracle Database Application Developer's Guide - Large Objects for more information about applications and LONG to LOB conversion

Complete the following steps to convert a LONG column to a LOB column in a replicated table:

Step 1 Make sure the data in the LONG column is consistent at all replication sites.

If a table containing a LONG column is configured as a master table, then Oracle does not replicate changes to the data in the LONG column. Therefore, the data in the LONG column might not match at all of your replication sites. You must make sure the data in the LONG column matches at all master sites before proceeding.

Step 2 Connect to the master definition site as the replication administrator.

Step 3 If the replication status is normal, then change the status to quiesced.

Step 4 Convert the LONG column to a LOB column.

A LONG_RAW column can be converted to a BLOB column using a similar ALTER TABLE statement.

Step 5 Regenerate replication support for the altered master table.

Step 6 Resume replication.

Step 7 If materialized views are based on the altered table at any of the master sites, then rebuild these materialized views.

Rebuild materialized views if necessary.

Disabling Replication

The DBMS_REPUTIL.REPLICATION_OFF procedure sets the state of an internal replication variable for the current session to false . Because all replicated triggers check the state of this variable before queuing any transactions, modifications made to the replicated tables that use row-level replication do not result in any queued deferred transactions.

Turning replication on or off affects only the current session. That is, other users currently connected to the same server are not restricted from placing committed changes in the deferred transaction queue.

If you are using procedural replication, then call REPLICATION_OFF at the start of your procedure, as shown in the following example. This ensures that the replication facility does not attempt to use row-level replication to propagate the changes that you make.

Altering a Replicated Object in a Quiesced Master Group

Use the ALTER_MASTER_REPOBJECT procedure in the DBMS_REPCAT package to alter the characteristics of your replicated objects in a quiesced master group. From the example following, notice that you simply include the necessary DDL within the procedure call (see the ddl_text parameter).

If any master site is lower than 9.0.1 compatibility level, then you must use the following procedure. That is, the master group must be quiesced to modify a replicated object. You control the compatibility level of a database with the COMPATIBLE initialization parameter.

Meet the following requirements to complete these actions:

Executed As : Replication Administrator

Executed At : Master Definition Site

Replication Status : Quiesced

Complete the following steps to alter a replicated object in a quiesced master group.

If your master site is running Oracle8 i Database release 8.1.7 or later in a single master environment and you are making a safe change to a replicated object, then you might not need to quiesce the master group. See the "ALTER_MASTER_REPOBJECT Procedure" for information about when quiesce is not required.

If you are viewing this document online, then you can copy the text from the "BEGINNING OF SCRIPT" line after this note to the "END OF SCRIPT" line into a text editor and then edit the text to create a script for your environment.

Step 1 Connect to the master definition site as the replication administrator.

Step 2 If necessary, then quiesce the master group.

See the "ALTER_MASTER_REPOBJECT Procedure" for information about when quiesce is not required.

Step 3 In a separate SQL*Plus session, check the status of the master group you are quiescing.

Do not proceed until the group's status is QUIESCED .

To check the status, run the following query:

Step 4 Alter the replicated object.

Step 5 Regenerate replication support for the altered object.

Step 6 In a separate SQL*Plus session, check if DBA_REPCATLOG is empty.

Do not proceed until this view is empty.

Execute the following SELECT statement in another SQL*Plus session to monitor the DBA_REPCATLOG view:

Step 7 Resume replication activity.

Ensuring that Replicated Triggers Fire Only Once

If you have defined a replicated trigger on a replicated table, then you might need to ensure that the trigger fires only once for each change that you make. Typically, you only want the trigger to fire when the change is first made, and you do not want the remote trigger to fire when the change is replicated to the remote site.

You should check the value of the DBMS_REPUTIL.FROM_REMOTE package variable at the start of your trigger. The trigger should update the table only if the value of this variable is false .

Alternatively, you can disable replication at the start of the trigger and reenable it at the end of the trigger when modifying rows other than the one that caused the trigger to fire. Using this method, only the original change is replicated to the remote sites. Then the replicated trigger fires at each remote site. Any updates performed by the replicated trigger are not pushed to any other sites.

Using this approach, conflict resolution is not invoked. Therefore, you must ensure that the changes resulting from the trigger do not affect the consistency of the data.

Reexecuting Error Transaction as Alternate User

The following procedure reexecutes a specified deferred transaction in the security context of the currently connected user. This procedure should not be executed until the error situation has been resolved.

Splitting and merging an Oracle Streams destination is useful under the following conditions:

A single capture process captures changes that are sent to two or more apply processes.

An apply process stops accepting changes captured by the capture process. The apply process might stop accepting changes if, for example, the apply process is disabled, the database that contains the apply process goes down, there is a network problem, the computer system running the database that contains the apply process goes down, or for some other reason.

When these conditions are met, it is best to split the problem destination off from the other destinations. The reason to split the destination off depends on whether the configuration uses the combined capture and apply optimization:

If the apply process at the problem destination is part of a combined capture and apply optimization and the destination is not split off, then performance will suffer when the destination becomes available again. In this case, the capture process must capture the changes that must now be applied at the destination previously split off. The other destinations will not receive more recent changes until the problem destination has caught up. However, if the problem destination is split off, then it can catch up to the other destinations independently, without affecting the other destinations.

If the apply process at the destination is not part of a combined capture and apply optimization, then captured changes that cannot be sent to the problem destination queue remain in the source queue, causing the source queue size to increase. Eventually, the source queue will spill captured logical change records (LCRs) to hard disk, and the performance of the Oracle Streams replication environment will suffer.

Split and merge operations are possible in the following types of Oracle Streams replication environments:

Changes captured by a single capture process are sent to multiple remote destinations using propagations and are applied by apply processes at the remote destinations.

Changes captured by a single capture process are applied locally by multiple apply processes on the same database that is running the capture process.

Changes captured by a single capture process are sent to one or more remote destinations using propagations and are applied locally by one or more apply processes on the same database that is running the capture process.

For environment with local capture and apply, split and merge operations are possible when the capture process and apply processes share the same queue, and when a propagation sends changes from the capture process's queue to an apply process's queue within the one database.

Figure 12-1 shows an Oracle Streams replication environment that uses propagations to send changes to multiple destinations. In this example, destination database A is down.

Figure 12-1 Problem Destination in an Oracle Streams Replication Environment

You can use the following data dictionary views to determine when there is a problem with a stream:

Query the V$BUFFERED_QUEUES view to identify how many messages are in a buffered queue and how many of these messages have spilled to hard disk.

When propagations are used, query the DBA_PROPAGATION and V$PROPAGATION_SENDER views to show the propagations in a database and the status of each propagation

To avoid degraded performance in this situation, split the stream that flows to the problem database off from the other streams flowing from the capture process. When the problem is corrected, merge the stream back into the other streams flowing from the capture process.

You can configure capture process parameters to split and merge a problem stream automatically, or you can split and merge a problem stream manually. Either way, the SPLIT_STREAMS , MERGE_STREAMS_JOB , and MERGE_STREAMS procedures in the DBMS_STREAMS_ADM package are used. The SPLIT_STREAMS procedure splits off the stream for the problem destination from all of the other streams flowing from a capture process to other destinations. The SPLIT_STREAMS procedure always clones the capture process and the queue. The SPLIT_STREAMS procedure also clones the propagation in an environment that sends changes to remote destination databases. The cloned versions of these components are used by the stream that is split off. While the problem stream is split off, the streams to other destinations proceed as usual.

Figure 12-2 shows the cloned stream created by the SPLIT_STREAMS procedure.

Figure 12-2 Splitting Oracle Streams

When the problem destination becomes available again, the cloned stream begins to send captured changes to the destination database again.

Figure 12-3 shows a destination database A that is up and running and a cloned capture process that is enabled at the capture database. The cloned stream begins to flow and starts to catch up to the original streams.

Figure 12-3 Cloned Stream Begins Flowing and Starts to Catch Up to One Original Stream

When the cloned stream catches up to one of the original streams, one of the following procedures merges the streams:

The MERGE_STREAMS procedure merges the stream that was split off back into the other streams flowing from the original capture process.

The MERGE_STREAMS_JOB procedure determines whether the streams are within the user-specified merge threshold. If they are, then the MERGE_STREAMS_JOB procedure runs the MERGE_STREAMS procedure. If the streams are not within the merge threshold, then the MERGE_STREAMS_JOB procedure does nothing.

Typically, it is best to run the MERGE_STREAMS_JOB procedure instead of running the MERGE_STREAMS procedure directly, because the MERGE_STREAMS_JOB procedure automatically determines whether the streams are ready to merge before merging them.

Figure 12-4 shows the results of running the MERGE_STREAMS procedure. The Oracle Streams replication environment has its original components, and all of the streams are flowing normally.

Figure 12-4 Merging Oracle Streams

Oracle Streams Concepts and Administration for information about combined capture and apply

12.3.2 Split and Merge Options

The following split and merge options are available:

12.3.2.1 Automatic Split and Merge

You can set two capture process parameters, split_threshold and merge_theshold , so that Oracle Streams performs split and merge operations automatically. When these parameters are set to specify automatic split and merge, an Oracle Scheduler job monitors the streams flowing from the capture process. When an Oracle Scheduler job identifies a problem with a stream, the job splits the problem stream off from the other streams flowing from the capture process. When a split operation is complete, a new Oracle Scheduler merge job monitors the split stream. When the problem is corrected, this job merges the stream back with the other streams.

When the split_threshold capture process parameter is set to INFINITE , automatic splitting is disabled. When the split_threshold parameter is not set to INFINITE , automatic splitting is enabled. Automatic splitting only occurs when communication with an apply process has been lost for the number of seconds specified in the split_threshold parameter. For example, communication with an apply process is lost when an apply process becomes disabled or a destination database goes down. Automatic splitting does not occur when one stream is processing changes slower than other streams.

When a stream is split, a cloned capture process is created. The cloned capture process might be enabled or disabled after the split depending on whether the configuration uses the combined capture and apply optimization:

If the apply process is part of a combined capture and apply optimization, then the cloned capture process is enabled. The cloned capture process does not capture any changes until the apply process is enabled and communication is established with the apply process.

If the apply process is not part of a combined capture and apply optimization, then the cloned capture process is disabled so that LCRs do not build up in a queue. When the apply process is enabled and the cloned stream can flow, you can start the cloned capture process manually.

The split stream is merged back with the original streams automatically when the difference, in seconds, between CAPTURE_MESSAGE_CREATE_TIME in the GV$STREAMS_CAPTURE view of the cloned capture process and the original capture process is less than or equal to the value specified for the merge_threshold capture process parameter. The CAPTURE_MESSAGE_CREATE_TIME records the time when a captured change was recorded in the redo log. If the difference is greater than the value specified by this capture process parameter, then automatic merge does not begin, and the value is recorded in the LAG column of the DBA_STREAMS_SPLIT_MERGE view.

When the capture process and the apply process for a stream run in different database instances, automatic split and merge is always possible for the stream. When a capture process and apply process for a stream run on the same database instance, automatic split and merge is possible only when all of the following conditions are met:

В настоящее время при построении многих автоматизированных систем возникает проблема синхронизации данных по нескольким источникам информации. Один из способов решения этой проблемы — репликации.

В данном топике я расскажу об одной из таких проблем и о том, как можно решить эту проблему с помощью технологии Oracle Streams.

Абстрактно

Проблема синхронизации данных по нескольким источникам информации представляет собой довольно нетривиальную задачу с весьма неоднозначным решением.
Подобные проблемы возникают довольно часто, но универсального решения такой задачи на текущий момент практически нет. Почти все готовые системы репликации данных работают с существенными ограничениями по структуре и способам накопления и изменения данных.

Введение в проблему

В настоящее время при синхронной репликации задержки отсутствуют, но это, в свою очередь, сказывается на пропускной способности транзакций и возможностях системы в целом.
Таким образом, нет ничего удивительного в том, что большинство пользователей выбирают асинхронную репликацию.
Задача состоит в том, чтобы сконструировать систему асинхронной репликации, которая сможет гарантировать малую фиксированную величину задержки и будет поддерживать полную пропускную способность транзакций базы данных.

Технология Oracle Streams

LCR, CR

В контексте Oracle Streams, информационное представление любого изменения, сделанного в базе данных, называется LCR (logical change record). LCR – это обобщенное представление всех возможных изменений, представленных в базе данных.
CR(change record) — запись изменения, используется для того, чтобы обозначить конкретное изменение в базе.

Rules and transformations

Также пользователь имеет возможность определять соответствия между LCR и набором правил. Эти правила оценивают все изменения, произведенные в базе данных, и проводят фильтрацию несоответствующих LCR.

Например, следующее правило определяет только DML изменения таблицы SCOTT.EMP

Точно также правила определяются и для DDL изменений.
Кроме того, к правилам могут быть привязаны трансформации. Трансформации используют пользовательские или системные хранимые процедуры и автоматически изменяют любой LCR , который удовлетворяет условиям используемого правила.

Queues

Очереди осуществляют хранение LCR, когда они двигаются в системе, т.е. находятся «между» процессами Oracle Streams.
Одна из первоочередных задач при настройке Oracle Streams — создать очереди и привязать их к процессам Oracle Streams. Для каждого процесса Oracle Streams может быть определен набор правил и связанных с этими правилами трансформаций для того, чтобы иметь возможность фильтровать информацию на «входе» и «выходе» процесса.
Очереди поддерживают три типа операций, enqueue — построение LCR в очередь, browse — просмотр LCR и dequeue – удаление.

Capture, Propagation and Apply

capture,
propagation,
apply.

считывание изменений, содержащихся в журналах транзакций,
преобразование CR в LCR,
постановка LCR в очередь.

просмотр LCR,
передача LCR из одной очереди в другую, причем очереди могут находится как на одной базе данных, так и на разных,
удаление LCR.

извлекает принятые LCR из очереди,
производит изменения с базой данных в соответствии с LCR,
удаляет LCR из очереди.

Overview

На рисунке изображена общая схема репликации на основе Oracle Streams:

В примере представлен вариант односторонней репликации, но возможны и другие варианты. К примеру, мы можем добавить другой набор процессов захвата, распространения и применения в противоположном направлении, чтобы получить двухстороннюю репликацию. Точно также, соединяя соответствующие процессы, можно формировать любую репликационную топологию.

Supplemental Logging

Как уже было сказано выше, в основе каждой LCR лежит CR, которая несет минимальное количество информации. Обычно, это возможные к извлечению измененные атрибуты и rowid.
Когда происходит изменение данных, т.е. DML изменение, LCR должен содержать первичные ключи изменяемых строк. Но возможен случай, когда принятые данные применяются в параллельных процессах, и тогда могут понадобиться и другие, не ключевые, колонки. Таким образом, CR может включать дополнительные, не ключевые, колонки. Это нужно для того, чтобы эти колонки оставались неизменными, а также, чтобы усилить проверку соответствия строк источника и приёмника. Добавление этих колонок в журналы транзакций называется supplemental logging.

Apply handler

При решении некоторых задач с помощью Oracle Streams удобно будет изменять принимаемую LCR «на лету» при помощи хранимой процедуры, написанной пользователем и называющейся apply handler.
Например, это необходимо когда репликации происходят между схемами с разными именами и, таким образом, LCR источника не может корректно примениться на приемнике. Отсюда следует, что задачей apply handler является преобразование изменений источника, представленного в виде LCR, так, чтобы они могли корректно применяться на приемнике.

Conflicts

У асинхронной репликации, так же как и у синхронной, есть свои недостатки. Основным недостатком асинхронной репликации является возможность несоответствия данных, также называемых конфликтами данных. Они возникают, когда пользователи делают изменения на приемнике, причем эти изменения конфликтуют с данными источника. Другими словами, после сделанных пользователем изменений данные источника и приёмника не соответствуют друг другу. Эти возможные несоответствия требуют анализа и решения. Такие конфликты чаще всего происходят при обновлении данных на источнике.

При анализе из LCR берутся «старые» данные источника, т.е. данные, которые были до изменения, сделанного конкретным LCR. Затем, во время применения, они сравниваются с текущими значениями в обновляемой строке на приёмнике.

Помимо пользовательских проверок соответствия база данных также отслеживает нарушения ссылочной целостности, уникальности и другие ограничения.

Кроме того, для анализа в Oracle Streams содержится встроенный конфликтный обработчик. Двумя основными «режимами» этого обработчика являются «максимальное значение» и «перезапись». В первом режиме при возникновении конфликта сравнивается старое и новое значения и записывается наибольшее, для строк большее значение выбирается с помощью ASCII кодов, во втором – всегда происходит запись нового значения. Также пользователи могут обрабатывать конфликты и сами, написав хранимую процедуру, которая будет решать возникающие конфликты.
Но практика показывает, что в большинстве случаев «режимы» встроенного обработчика подходят для требований большинства пользователей.

Заключение

В настоящей статье я попытался описать основные процессы и объекты Oracle Streams, которые на мой взгляд могут кому-нибудь интересны. В детали я не вдавался, а описал все поверхностно. Более подробно всегда можно прочитать в официальной документации. Главное знать что читать.

Я не человек базы данных, точно, и большая часть моей работы с БД была с MySQL, поэтому простите меня, если что-то в этом вопросе невероятно наивно.

мне нужно удалить 5.5 миллионов строк из таблицы Oracle, которая имеет около 100 миллионов строк. У меня есть все идентификаторы строк, которые мне нужно удалить во временной таблице. Если бы это было всего несколько тысяч строк, я бы сделал следующее:

есть ли что-то, что мне нужно знать, и / или делать по-другому, потому что это 5.5 миллион строк? Я подумал о том, чтобы сделать цикл, что-то вроде этого:

прежде всего - это делает то, что я думаю, что это-пакетные коммиты 200,000 за раз? Предполагая, что это так, я все еще не уверен, лучше ли генерировать 5,5 миллиона SQL-операторов и фиксировать партиями по 200 000 или иметь один SQL-оператор и фиксировать все сразу.

идеи? Передовая практика?

редактировать: я запустил первый вариант, оператор single delete, и он только потребовалось 2 часа, чтобы завершить разработку. Исходя из этого, он находится в очереди на запуск в производство.

первый подход лучше, потому что вы даете оптимизатору запросов четкое представление о том, что вы пытаетесь сделать, вместо того, чтобы пытаться скрыть его. Компонент database engine может использовать другой подход к внутреннему удалению 5.5 m (или 5.5% таблицы), чем к удалению 200k (или 0.2%).

вот также статьи о массовом удалении в Oracle, которое вы можете прочитать.

самый быстрый способ создать новый с CREATE TABLE AS SELECT используя . Я имею в виду:

конечно, вы должны воссоздать ограничения без проверки, индексы с nologging, гранты. но очень очень быстро.

если вы имеете тревогу в продукции, то вы можете сделать следующее:

вы должны позаботиться о:

хранимые процедуры могут быть признаны недействительными, но они будут перекомпилированы при втором вызове. Вы должны это проверить.
NOLOGGING означает, что минимальный создаются повторы. Если у вас есть роль DBA, запустите ALTER SYSTEM CHECKPOINT чтобы гарантировать отсутствие потерянных данных при сбое экземпляра.
на NOLOGGING табличное пространство должно быть также в NOLOGGING .

другой вариант лучше, чем создавать миллионы вставки:

UPDATE: почему я могу гарантировать, что последний блок PLSQL будет работать? Потому что я предполагаю, что:

никто другой не использует эту временную таблицу по какой-либо причине (dba или задания, собирающие статистику, задачи dab, такие как перемещение, вставка записей и т. д.). Это можно обеспечить потому что вспомогательная таблица только для этот.
затем, с последним утверждением, запрос будет выполнен ровно С тем же планом и собирается вернуть строки с тем же порядком.

при выполнении массовых удалений в Oracle , убедитесь, что у вас не заканчивается UNDO SEGMENTS .

при выполнении DML , Oracle сначала записывает все изменения в REDO log (старые данные вместе с новыми данными).

когда REDO журнал заполняется или происходит тайм-аут, Oracle выполняет log synchronization : он пишет new data в файлы данных (в вашем случае отмечает блоки файлов данных как свободные) и записывает старые данные в UNDO табличное пространство (так что он остается видимым для параллельных транзакций, пока вы commit ваши изменения).

когда вы фиксируете свои изменения, пробел в UNDO сегменты, занятые транзакцией yuor, освобождаются.

это означает, что если удалить 5M строки данных, вам нужно будет иметь место для all эти строки в свой UNDO сегменты, чтобы данные могли быть перемещены туда первыми ( all at once ) и удаляется только после фиксации.

это также означает, что параллельные запросы (если таковые имеются) необходимо будет прочитать из REDO журналы или UNDO сегментов при выполнении сканирования таблицы. Это не самый быстрый способ доступа к данным.

это также означает, что если оптимизатор будет выбирать HASH JOIN для вашего запроса на удаление (что он, скорее всего, сделает), и временная таблица не будет вписываться в HASH_AREA_SIZE (что, скорее всего, будет так), тогда запрос будет нужен several просматривает над большой таблицей, и некоторые из частей таблицы будут уже переехал в REDO или UNDO .

учитывая все сказанное выше, вам, вероятно, лучше удалить данные в 200,000 куски и зафиксировать изменения между ними.

таким образом, Вы, во-первых, избавиться от проблем, описанных выше, и, во-вторых, оптимизировать свой HASH_JOIN , поскольку у вас будет такое же количество чтений, но сами чтения будут более эффективными.

в вашем случае, однако, я бы попытался заставить оптимизатор использовать NESTED LOOPS , as Думаю, в твоем случае это будет быстрее.

чтобы сделать это, убедитесь, что ваша временная таблица имеет первичный ключ ID , и перепишите свой запрос следующим образом:

вам нужно будет иметь первичный ключ на temp_table для работы этого запроса.

сравните его со следующим:

, посмотрите, что быстрее и придерживайтесь этого.

лучше делать все сразу, как в первом примере. Но я определенно пройдусь по нему с вашим DBA, так как они могут захотеть вернуть блоки, которые вы больше не используете после чистки. Кроме того, могут быть проблемы планирования, которые обычно не видны с точки зрения пользователя.

если исходный SQL занимает очень много времени, некоторые параллельные SQL могут работать медленно, так как они должны использовать UNDO для восстановления версии данных без незафиксированных изменений.

компромисс может быть что-то вроде

вы можете настроить ROWNUM по мере необходимости. Меньший ROWNUM означает более частые коммиты и (возможно) снижение влияния на другие сеансы с точки зрения необходимости применения отмены. Однако, в зависимости от планов выполнения, могут быть и другие последствия, и это вероятно, потребуется больше времени в целом. Технически " FOR " часть цикла не нужна, так как выход завершит цикл. Но я параноик о неограниченных петлях, так как это боль, чтобы убить сеанс, если они застряли.

Я бы рекомендовал запустить это как одно удаление.

есть ли дочерние таблицы того, из которого вы удаляете? Если это так, убедитесь, что внешний ключ в этих таблицах индексируется. В противном случае вы можете выполнить полное сканирование дочерней таблицы для каждой удаленной строки, что может замедлить работу.

вам могут понадобиться некоторые способы проверить ход удаления по мере его выполнения. См.Как проверить базу данных oracle на длительный срок запросы?

Как предложили другие люди, если вы хотите проверить воду, вы можете поместить: rownum

Я сделал что-то подобное в прошлом с Oracle 7, где мне пришлось удалить миллионы строк из тысяч таблиц. По всем характеристикам и особенно больших удалений (миллион строк, плюс в одной таблице) этот скрипт работал хорошо.

вам нужно будет немного изменить его (т. е. изучить пользователей/пароли, а также получить правильные сегменты отката). Также вам действительно нужно обсудить это с вашим DBA и запустить его в тестовой среде. Сказав Все это, довольно легкий. Функция delete_sql() ищет пакет rowids в указанной таблице, а затем удаляет их пакет за пакетом. Например:

приведенный выше пример удаляет 500 записей одновременно из таблицы MSF170 на основе инструкции sql.

Если вам нужно удалить данные из нескольких таблиц, просто включают дополнительные exec delete_sql(. ) строки в файле delete-tables.в SQL

О и не забудьте поставить свои сегменты отката обратно в интернет, это не в сценарии.

Ну и последний совет. Это будет медленно и в зависимости от таблицы может потребоваться некоторое время простоя. Тестирование, синхронизация и настройка-ваш лучший друг здесь.

все ответы здесь отличные, только одна вещь, чтобы добавить: если вы хотите удалить все записей в таблице, а конечно вам не понадобится откат, тогда вы хотите использовать усечь таблицу.

(в вашем случае вы хотели удалить только подмножество, но для тех, кто скрывается с подобной проблемой, я думал, что добавлю это)

Managing the Error Queue

As an administrator of a replication environment, you should regularly monitor the error queue to determine if any deferred transactions were not successfully applied at the target master site.

To check the error queue, issue the following SELECT statement (as the replication administrator) when connected to the target master site:

If the error queue contains errors, then you should resolve the error condition and reexecute the deferred transaction. You have two options when reexecuting a deferred transaction: you can reexecute in the security context of the user who received the deferred transaction, or you can reexecute the deferred transaction with an alternate security context.

If you have multiple error transactions and you want to make sure they are reexecuted in the correct order, then you can specify NULL for the deferred_tran_id parameter in the procedures in the following sections. If you do not specify NULL , then reexecuting individual transactions in the wrong order can cause conflicts.

Reenabling Replication

After resolving any conflicts, or at the end of your replicated procedure, be certain to call DBMS_REPUTIL.REPLICATION_ON to resume normal replication of changes to your replicated tables or materialized views. This procedure takes no arguments. Calling REPLICATION_ON sets the internal replication variable to true .

Reexecuting Error Transaction as the Receiver

The following procedure reexecutes a specified deferred transaction in the security context of the user who received the deferred transaction. This procedure should not be executed until the error situation has been resolved.

Meet the following requirements to complete these actions:

Executed As : Replication Administrator

Executed At : Site Containing Errors

Replication Status : Normal

Complete the following steps:

Step 1 Connect to the master site as the replication administrator.

Step 2 Reexecute the error transaction.

Altering a Replicated Object

As your database needs change, you might need to modify the characteristics of your replicated objects. It is important that you do not directly execute DDL to alter your replicated objects. Doing so might cause your replication environment to fail.

If the logical structure of a master table is altered (for example, if a column name or type is changed), then all dependent materialized views must be rebuilt.

Using the DIFFERENCES Procedure

The DIFFERENCES procedure compares two replicas of a table, and determines all rows in the first replica that are not in the second and all rows in the second that are not in the first. The output of this procedure is stored in two user-created tables. The first table stores the values of the missing rows, and the second table is used to indicate which site contains each row.

Using the ANYDATA Type to Determine the Value of an Argument in a Deferred Call

If you are using column objects, collections, or REF s in a replicated table, then you can use the GET_ANYDATA_ARG function in the DBMS_DEFER_QUERY package to determine the value of an argument in a deferred call that involves one of these user-defined types.

The following example illustrates how to use the GET_ANYDATA_ARG function. This example uses the following user-defined types in the oe sample schema.

The following procedure retrieves the argument value for collection, object, and REF instances of calls stored in the deferred transactions queue. This procedure assumes that the call number and transaction id are available.

The user who creates the procedure must have EXECUTE privilege on the DBMS_DEFER_QUERY package and must have CREATE PROCEDURE privilege. This example uses the oe sample schema. Therefore, to run the example, you must grant the oe user these privileges.

Читайте также: