Email: service@parnassusdata.com 7 x 24 online support!

Oracle ORA-1157 Troubleshooting

Oracle ORA-1157 Troubleshooting

PURPOSE
This note is intended to list the common reasons and solutions for the ORA-1157 error.
 
SCOPE
NOTE: In the images and/or the document content below, the user information and environment data used represents fictitious data from the Oracle sample schema(s), Public Documentation delivered with an Oracle database product or other training material. Any similarity to actual environments, actual persons, living or dead, is purely coincidental and not intended in any manner.
 
 
This article is intended for Oracle Support and Oracle database administrators.
 
DETAILS
Oracle Error: ORA-1157
An ORA-01157 is issued whenever Oracle attempts to access a file but cannot find or lock the file.
Error Explanation:
 
01157, 00000, "cannot identify/lock data file %s - see DBWR trace file"
 
Cause: The background process was either unable to find one of the data files or failed to lock it because the file was already in use. The database will prohibit access to this file but other files will be unaffected. However the first instance to open the database will need to access all online data files. Accompanying error from the operating system describes why the file could not be identified.
 
Action: Have operating system make file available to database. Then either open the database or do ALTER SYSTEM CHECK DATAFILES.
ORA-01157 errors are usually followed by ORA-01110 and possibly an Oracle operating system layer error such as ORA-07360. A DBWR trace file is generated in the background_dump_dest directory.
For Example, on Solaris platform, the following errors will appear: 
 
ORA-01157: cannot identify/lock data file 19 - see DBWR trace file
ORA-01110: data file 19: '/<disk_path>/users02.dbf'
From the DBWR trace file:
 
ORA-01157: cannot identify/lock data file 19 - see DBWR trace file
ORA-01110: data file 19: '/<disk_path>/users02.dbf'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
 
Common Causes and Solutions for ORA-1157
 Note: Throughout this note we refer to "backups" but if you have a valid physical standby database
you may also use the standby database's datafiles to recover the primary database.
 
1.  The datafile does exist, but Oracle cannot find it.
 
The datafile may have been renamed at the operating system level, moved to a different directory or disk drive either intentionally or unintentionally.
 
In this case, restore and recover the datafile or move the datafile to its original name.
 
2.  The datafile does not exist or is unusable by Oracle. The datafile has been physically removed or damaged to an extent that Oracle cannot recognize it anymore.
 
For example, the datafile might be truncated or overwritten, in which case
ORA-27046 will accompany ORA-1157 error.
 
For example:
 
ORA-27046: file size is not a multiple of logical block size
 
In this case, the user has two options:
 
A   Recreate the tablespace that the datafile belongs to.
 
 
This option is best suited for USERS, INDEX, TEMPORARY tablespaces.
 
It is also recommended for UNDO tablespaces if the database had been SHUTDOWN CLEANLY, so that no active transactions are there in the rollback segments of this tablespace.
 
If the tablespace is SYSTEM tablespace, then this amounts to recreating or rebuilding the database.
 
This method is best suited for temporary tablespaces (since they do not contain important data), but can be used for USERS tablespaces and INDEXES tablespaces.
 
This method would be helpful wherein reasonably recent exports of the objects in the tablespace are available, or that the tables in the tablespace can be repopulated by running a script or program, loading the data through SQL*Loader, etc.
 
The steps involved are:
 
1. If the database is down, mount it.
 
STARTUP MOUNT;
 
2. Offline drop the datafile.
 
ALTER DATABASE DATAFILE 'full_path_file_name' OFFLINE DROP;
 
3. If the database is at mount, open it.
 
ALTER DATABASE OPEN;
 
4. Drop the user tablespace.
 
DROP TABLESPACE tablespace_name INCLUDING CONTENTS;
 
Note: The users can stop with this step if they do not want the
tablespace anymore in the database.
 
5. Recreate the tablespace.
 
CREATE TABLESPACE tablespace_name DATAFILE 'datafile_full_path_name' SIZE required_size;
 
6. Recreate all the previously existing objects in the tablespace.
 
This can be done using the creation scripts for the objects in that tablespace or using the recent export dump available for that tablespace objects.
 
 
B. Recover the datafile using normal recovery procedures.
 
This option is best suited for READ ONLY tablespaces and for USERS, INDEX tablespaces where recreating is not a feasible option.
 
If the tablespace is of type UNDO, then this is the method to be used if the database was not SHUTDOWN CLEANLY.
(that is, if shutdown abort had been used or the database had crashed)
 
If the tablespace is SYSTEM, then this is the recommended method, if there are backups and archivelogs are available. If the database is in
NOARCHIVELOG mode, then you can recover only if the required changes are present in the ONLINE redologs.
 
In many situations, recreating the user tablespace is impossible or too laborious. The solution then is to restore the lost datafile from a backup
and do media recovery on it. If the database is in NOARCHIVELOG mode, you will only succeed in recovering the datafile if the redo to be applied
to the datafile is within the range of the online logs.
 
This method would be ideal for READ ONLY tablespaces. If the tablespace was not switched to READ-WRITE after backup was taken and if the tablespace was
READ ONLY at the time of backup, then recovery is just restoring the backup of this tablespace.
 
These are the steps:
 
1. Restore the lost file from a backup.
 
2. If the database is down, mount it. 
 
STARTUP MOUNT;
 
3. Issue the following query:
 
SELECT V1.GROUP#, MEMBER, SEQUENCE#,
FIRST_CHANGE#
FROM V$LOG V1, V$LOGFILE V2
WHERE V1.GROUP# = V2.GROUP# ;
 
This will list all your online redolog files and their respective sequence and first change numbers.
 
4. If the database is in NOARCHIVELOG mode, issue the query:
 
SELECT FILE#, CHANGE# FROM V$RECOVER_FILE;
 
If the CHANGE# is GREATER than the minimum FIRST_CHANGE# of your logs, the datafile can be recovered. Just keep in mind that all the logs to
applied will be online logs, and move on to step 5.
 
If the CHANGE# is LESS than the minimum FIRST_CHANGE# of your logs, the file cannot be recovered. Your options at this point would be to restore
the most recent full backup (and thus lose all changes to the database since) or recreate the tablespace as explained in scenario a.
 
 
5. Recover the datafile: 
 
RECOVER DATAFILE 'full_path_file_name' ;
 
6. Confirm each of the logs that you are prompted for until you receive the message "Media Recovery Complete". If you are prompted for a non-existing
archived log, Oracle probably needs one or more of the online logs to proceed with the recovery. Compare the sequence number referenced in the
ORA-280 message with the sequence numbers of your online logs. Then enter the full path name of one of the members of the redo group whose sequence
number matches the one you are being asked for. Keep entering online logs as requested until you receive the message "Media Recovery Complete" .
 
7. If the database is at mount point, open it.
 
Operating Systems (OS) Tempfiles missing:
 
When using TEMPORARY tablespaces with tempfiles, the absence of the tempfile at the OS level can cause ORA-1157. Since Oracle does not checkpoint tempfiles, the database can be opened even with missing tempfiles.
 
The solution in this case would be to drop the logical tempfile and add a new one.
 
For example:
 
select * from dba_objects order by object_name;
select * from dba_objects order by object_name;
*
ERROR at line 1:
 
ORA-01157: cannot identify/lock data file 1026 - see DBWR trace file
ORA-01110: data file 1026: '/<disk_path>/temp2_01.tmp'
Solution:
 
alter database tempfile '/<disk_path>/temp2_01.tmp' drop;
 
select tablespace_name, file_name from dba_temp_files;
 
alter tablespace temp2 add tempfile '/<disk_path>/temp2_01.tmp' size 5m;
 
 
 
ORA-1157 due to OS issues/3rd party software
1. When trying to access Quick I/O files with vxfddstat, or other applications, getting an error message similar to "Cannot open file".
 
Oracle may return an error message similar to:
 
 
ORA-01157: cannot identify data file 1 - file not found
ORA-01110: data file 1: '/<disk_path>/system01.dbf'
The users need to contact Veritas support in this case. To access their support site, point your web browser to:
 
 
Click on: 'Product Listing'
Click on: 'File System for UNIX'
Enter 'Oracle' and click 'Search' to view relevant information from their Knowledge Base.
 
2. It is possible to get this error on HP if the kernel parmeter nflock is not set high enough. This might prevent Oracle to lock the required datafiles. Controlfile recreation might fail with ORA-27041 and ORA-1157 for the same reasons.
 
There may be more errors in the trace files in dump directory such as :
 
ORA-27086: skgfglk: unable to lock file - already in use
OR
 
ORA-01157: cannot identify/lock data file 263 - see DBWR trace file
ORA-0110: data file 263: '/<disk_path>/system01.dbf'
ORA-27041: unable to open file
HP-UX Error: 23: File table overflow
Additional information: 2
OR
 
 
ORA-07445: exception encountered: core dump [%s] [%s] [%s] [%s] [%s] [%s]
ORA-01110: data file %s: '%s'
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode
ORA-01115: IO error reading block from file %s (block # %s)
ORA-27041: unable to open file
HP-UX Error: 23: File table overflow
Additional information: 3
To resolve these issues, increase the relavent kernel parameter on HP. The recommended settings would be :
 
nproc 4096 Max Number of Processes
nfile 63488 Max Number of Open Files
nflocks 4096 Max Number of File Locks
 
Refer to the OS Installation Guide for more information on these parameters.
 
To resolve the ORA-27041 'File table overflow' on Solaris, experience with the system settings below resolved -
 
/etc/system parameters
set vxfs:vxfs_ninode=692416
set ncsize=519312
 
Please note that tuning these parameters is outside the scope of Oracle Support.
For further info refer to Sun document available on sunsolve.sun.com (76671) which explains ncsize tuning.
And, tuning vxfs:vxfs_ninode is Veritas-specific, please post questions directly to Veritas.
 
3. It is possible to get ORA-1157, if the datafiles that Oracle requires are locked by some other process, for example a backup software might be locking
the files for backup.
 
On MS Windows, the user can get any of the following errors:
 
ORA-01157: signalled during alter database open
ORA-01157: can not identify datafile
ORA-01110: data file 10: '<disk_path>\index01.dbf'
ORA-27047: Unable to read header of file
OSD-04006: Read file failure
Error 33: process can not access file
The operating system error 33 is an error_lock_violation indicating that a portion of the data file is locked by another NT process.
 
OR
 
ORA-1157 - cannot identify datafile - file not found
ORA-1110 - datafile 11 '<disk_path>\index02.dbf'
ORA-9202 - sfifi: error identifying file
OSD-4006(OS 203) - The System could not find the environment option that was entered
 
 
Other error combinations that may show up in the alert log include:
 
ORA-1115 - IO error reading block from file %s (block # %s)
ORA-1110 - datafile 11 '<disk_path>\index02.dbf'
ORA-9206 - sfrfb: error reading from file
OSD-4006(OS 203) - The System Could not find the environment option that was entered
 
 
OR
 
ORA-1242 - data file suffered media failure: database in NOARCHIVELOG mode
ORA-1114 - IO error writing block to file &lt;name&gt; block #
ORA-9205 - sfqio: error reading or writing to disk
OSD-4016(OS 33) - The process cannot access the file because another process has locked a portion of the file.
 
Additionally, the following errors will appear:
 
KCF: write/open error dba=0x703473d block=0x3473d online=1
file=7 /<disk_path>/users01.dbf
error=9211 txt: 'OSD-4008 : WriteFile error (OS 203) - The System Could not find the environment option that was entered
 
In some cases, the alert log may also show errors such as:
 
Instance terminating due to error 1110.
Instance terminated by background process PID=<pid>
 
OR
 
background process TERMINATING INSTANCE DUE TO ERROR 472
 
ORA 472 - PMON process terminated with error
 
The following events may also be reported in the Microsoft Windows event viewer:
 
23 Error ReadFile() failure.
25 Error WriteFile() failure.
 
If this is a cold backup, need to wait until the backup is done and then
startup the database, or end the backup and startup the database.
 
Alternatively, the solution is that the backup software should be configured
such that it does not lock open files.
 
Refer to the BACKUP software documentation for information on how to do this.
 
For example, the Seagate Backup Exec Software, should be configured with
'Read and Not lock' option to take online backups.
 
This is true on some Unix platforms also.
For example, the IBM AIX ADSM backup utility can fall over before a successful run was performed, thereby holding the lock on Oracle datafiles.
 
The solution in this case would be to clear the lock on the datafile manually.
 
i. Run $ ps -ef | grep &lt;SID&gt; - look for an existing process on a datafile
 
ii. Do $ kill -9 on the process id
 
 
 
4. ORA-1157 is possible if the Oracle datafiles were copied to a directory using the File Manager on Windows.
This is true if the filenames are greater than the usual 8.3 format.
i.e., the files are of greater than 8 characters long name or have greater than 3 characters of extention.
 
The avoid this problem, when copying files with Windows 95/98, DO NOT use File Manager. If the files have long file names (e.g.: more than the
standard 8.3 file name convention), File Manager will rename the files with an 8.3 file name.
 
To preserve long filenames, Explorer should be used to copy files. If File Manager is already used and the files have a tilde (~) in the file name, the
files needs to be renamed to their original names.
 
5. ORA-1157 can be possible when using NETWORK appliance. Network Applicance acquires locks on the datafiles for certain operations.
These locks may be retained by the Network Appliance due to an instance or host machine failure. In these cases, the locks must be manually released from the Network
Appliance by the System Administrator.
 
The command for this would be:
 
As root on the netapp, from the prompt:
 
rc_toggle_basic
 
sm_mon -l &lt;hostname&gt;
 
The hostname being referenced in the above command should be the machine name where the Oracle instance is running from.
 
6. ORA-1157 is possible if Oracle files were restored as a different user.
 
After restoring a datafile and issuing recover datafile, the error ORA-1157 (cannot identify datafile - file not found) can occur. It does not appear
to recognize the restored datafile inspite that:
 
- The datafile exists at the OS.
- select * from v$datafile shows the correct path for the datafile.
- "alter system check datafiles" is successful.
- backup control file to trace shows full path to datafile is correct.
 
In this case, it could be a permissions problem at the OS level.
 
Check the permissions of the datafiles. When the datafile was restored, it may have been done by another user, other than Oracle such as root.
In this case, the file can be seen by Oracle, but not accessed. It cannot read and write to the datafile because it does not own the file.
Changing the ownership permissions to the Oracle user will allow the restore datafile command to succeed.
 
 
ORA-1157 - Other possibilities
1. Corrupt controlfile can cause ORA-1157
 
Users can get ORA-1157 in some extreme cases, where, the controlfile becomes corrupt.
 
One possible type of corruption that can result in these errors is having a trailing blank after the filename in the control file. This can be seen by
examining a text version of the control file produced using the 'ALTER DATABASE BACKUP CONTROLFILE TO TRACE' command.
(The trace file will be placed in the location specified by the user_dump_dest initialization parameter).
 
Example:
--------
 
'/<disk_path>/index1.dbf ' -- corrupt
'/<disk_path>/index02.dbf' -- non-corrupt
 
In these case, the solution would be to try using the other controlfile copies (if they are not corrupt) by changing the CONTROL_FILES parameter in
the init.ora to exclude the corrupt control file. If all the control files are corrupt, then the controlfile can be recreated.
It is also possible to have other 'unexpected' control characters embedded in the control file.
 
For example:
 
"^J" may be present in the datafile name.
 
The above solution will hold good in this case also.
 
 
2. Check if there was any tablespace/datafile added in the primary after the standby was setup.
 
If so, create the datafile(s) on the standby database manually. When the files exist, recovery can continue. If datafiles are not automatically created on the standby site.
 
For example:
 
The redo does not create a new datafile for you. The create datafile command from startup mount is:
 
alter database create datafile 'datafile_full_path_name';
 
 
3. RMAN restore can produce 'fake' ORA-1157 errors in the alert.log
 
During an RMAN restore operation, it is possible to encounter the following error(s) in the alert log:
 
ORA-01157: cannot identify/lock data file N - see DBWR trace file
ORA-01110: data file N: 'filename'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
This problem occurs if RMAN is restoring datafiles that have been removed from disk prior to the restore operation.
 
These errors (which occur multiple times for each affected datafile) can be a source of concern and confusion for DBAs. However, they are entirely
expected and, apart from monitoring the size of the alert log (and archiving it if necessary), are nothing to worry about.
 
4. ORA-1157 with newly installed seed database.
 
If seed database was used, Oracle will copy the datafiles from CD to disk and try to startup the database. But, if the datafiles are corrupt in the CD,
that is, if the CD is damaged, it could result in ORA-1157 error.
 
The solution to this would be to use new set of CDs or create database using manual creation scripts.