Aborted Module Name: FAIDCFAT_SM_GLBDATA-LOOP_01

Date: Day: Time: Resolution:

07/11/08 Fri 13:00 See Jan’s reply in follow-up section below.

Error log and follow up comments:

+ cat /ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.DAT

+ 1> /ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.selections_todo

cat: 0652-050 Cannot open /ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.DAT.

+ exit 2

+ err=2

+ [ 2 -eq 0 ]

+ [ 2 != 0 ]

+ status=ABORTD

FAIDCFAT_FA.GLBDATA-LOOP_01 (and SM) failed because the conditions on the CHAIN_INIT were not executed. I manually executed the commands and restarted the GLBDATA-LOOP. Chains are complete.

Jan.
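
The abort above came from cat hitting a .DAT file that an earlier step was expected to create. A minimal sketch, in Python, of a pre-flight check that a loop component could run before reading its driver file. The function name and messages are hypothetical and are not part of the CHAIN_INIT or GLBDATA-LOOP scripts.

```python
import os

def check_driver_file(path):
    """Return (ok, message) describing whether the driver file can be read."""
    if not os.path.isfile(path):
        return False, "driver file missing: " + path
    if not os.access(path, os.R_OK):
        return False, "driver file not readable: " + path
    return True, "driver file present: " + path

# Path taken from the trace above; the result depends on the host it runs on.
ok, message = check_driver_file("/ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.DAT")
print(message)
```

Failing early with a message that names the missing file may make it easier to trace the problem back to the step that should have created it.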

Aborted Module Name: FAIDSNTD.WAIT_FOR_CHAINS_01

Date: Day: Time: Resolution:

02/01/11 Tue 06:29 Restarted by ITS.

Error log and follow up comments:

*** ERROR: NO CHAIN MODULES FOUND FOR CHAIN

+ cat /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.dat

5719026.01 FAID TDCLIENT_SEND 02/01 06:29 ABORTED AWPROD APPWORX

5719049.01 FAID TDCLIENT_SEND 02/01 06:37 ABORTED AWPROD APPWORX

5719059.01 FAID TDCLIENT_SEND 02/01 06:39 ABORTED AWPROD APPWORX

+ exit 1

Looks like it failed again - due to the 3 TDCLIENT_SEND modules which are in failed status. I tried restarting one of the TDCLIENT_SEND's, but it failed again:

+ tdclientc network=saigportal ftpuserid=TG51279 passive=Y data_over_command=y reset transfer=(name=CRDL11IN senduserid=TG51279 send=/ais01/dat/work/prod/CRDL11INsendfile other_comp_parms=secfile=/ais01/dat/work/prod/CRDL11INsecfile)

+ 1> /ais01/dat/work/prod/TDCLIENT_SEND.CRDL11IN.TXT 2>& 1

+ exit 19

+ err=19.

Apparently there's a communication problem with SAIG - maybe Phil can follow up with FAID staff regarding the situation with SAIG? Janice.

The TDCLIENT password has been changed (via FAIDSPWD_TDCLIENT_CHG_PASSWORD). I restarted the failed TDCLIENT_SEND modules and they completed successfully. Please restart the following failed components:

FAIDSNTD.WAIT_FOR_CHAINS_01

FAIDDLM2_EV.TDCLIENT_01.

Janice.
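
The jq.dat listing earlier in this entry is what showed the three failed TDCLIENT_SEND modules. A small sketch of pulling the aborted rows out of such a listing. The column order (job id, application, module, date, time, status) is an assumption read off the three sample lines, not a documented format.

```python
def aborted_modules(listing):
    """Return (job_id, module) pairs for rows whose status column is ABORTED.

    Assumed column order: job id, application, module, date, time, status, ...
    """
    found = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] == "ABORTED":
            found.append((fields[0], fields[2]))
    return found

sample = """\
5719026.01 FAID TDCLIENT_SEND 02/01 06:29 ABORTED AWPROD APPWORX
5719049.01 FAID TDCLIENT_SEND 02/01 06:37 ABORTED AWPROD APPWORX
5719059.01 FAID TDCLIENT_SEND 02/01 06:39 ABORTED AWPROD APPWORX
"""
for job_id, module in aborted_modules(sample):
    print(job_id, module)
```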

Aborted Module Name: FAIDALEX_EV.SSH_SFTP_01

Date: Day: Time: Resolution:

07/14/08 Mon 18:00 David deleted the FAIDALEX chain.

Error log and follow up comments:

- sftp

# COMMAND : /usr/bin/sftp -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_elmnet" SCH05FO@ftp.elmproduction.com

# > Authenticated with partial success.

# > Permission denied (password,gssapi-with-mic).

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

# > (100)

#==============================================================================

# FATAL : Command failed with code : 100

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

Child: Job return = 100

14 18:00:24- Child: put to memory:[100]

Janice left message below in News File.

“I'll leave the FAIDALEX_EV.SSH_SFTP_01 failure since this is a communications issue with vendor that will need to be resolved with them.”

07/16/08 – Per David – “I deleted the FAIDALEX chain from yesterday.”

Aborted Module Name: FAIDPPNT_OD.LYNX_01

Date: Day: Time: Resolution:

06/12/13 Wed 11:05 Restarted by Joleen.

Error log and follow up comments:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<html>

<body>

You have requested a page that either never existed or no longer exists on this web server.<BR><BR>

The web page you are visiting is part of a larger site maintained by a department

at <a href="http://www.colostate.edu">Colorado State University</a>.<BR><BR>

If you came to this page via a "bookmark", this page may have been moved. You may be able to rectify this

error by contacting the webmaster of this website by going to the main site page and finding any contact information on that page. Otherwise, visit <a href="http://www.colostate.edu">CSU's main website</a>, and use directory search functionality to contact the department responsible for this website.

</font>

Hey Joleen, can you please change the lynx module’s url parameter?

From: http://wsprod.colostate.edu/cwis231/autorun/plus_email.cfm?ay={#1}

To: http://wsnet.colostate.edu/cwis231/autorun/JobChain/PlusEmail.aspx

It looks like I never asked to switch this over… sorry about that.

Zach.

It aborted again. Bummer! I have attached the standard output.

[Win32Exception]: The network path was not found

Here is the URL I used on the one I restarted:

http://wsnet.colostate.edu/cwis231/autorun/JobChain/PlusEmail.aspx

Joleen.

Try again, I made some modifications to the connections.

Zach.

Aborted Module Name: FAIDTRAK_EV.LYNX-01

Date: Day: Time: Resolution:

08/08/08 Fri 06:29 Restarted and David confirmed that error reported was correct

(see contents of Janice’s email below)

Error log and follow up comments:

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

+ orig_log_run=Y

+ export orig_log_run

+ log_run=Y

+ export status log_run

+ [ Y = N ]

+ [ -f /Applications Manager/exec/TROUBLE ]

+ [ -f /Applications Manager/exec/TROUBLE.CSU_UTILITIES ]

+ [ -f /Applications Manager/exec/TROUBLE.UNKNOWN ]

+ [ -f /Applications Manager/exec/TROUBLE.APPLICATIONS MANAGER_SHELLS ]

+ [ -f /Applications Manager/exec/TROUBLE.FAIDTRAK_EV.LYNX_01 ]

+ [ -f /Applications Manager/exec/TROUBLE.LYNX.KSH ]

+ export status

+ [ -f /Applications Manager/exec/COMPLETION ]

+ echo Executing COMPLETION

Executing COMPLETION

I noticed that the FAID schedule stalled out early last night due to this FAIDTRAK failure. I know that I haven’t had a chance to discuss the exclude date changes with you, but IT Scheduling may need to use that feature today if yesterday’s FAID schedule runs too long.

So, briefly – here’s the deal --

The portion of your documented procedure to:

- Update /ais01/dat/work/prod/AAAAAW99.WAIT_FOR_CHAINS_01.DAT with the current date (in mm/dd format), where AAAA is the 4-character application name

should be modified to:

- Update Applications Manager subvar #AAAAAW99_EXCLUDE_DATE with the current date (in mm/dd format), where AAAA is the 4-character application name. If the update does not occur until after midnight, you would update with the before-midnight date – i.e. the date that corresponds to the “scheduled” run date for the new day’s schedule.

I think this would be the only change to your documented procedure. The WAIT_FOR_CHAINS script will automatically set the Applications Manager subvar #AAAAAW99_EXCLUDE_DATE variable back to “NO_EXCLUDE_DATE” when it completes, so IT Scheduling does **NOT** have to worry about resetting the value when AAAAAW99 completes.

So, if yesterday’s FAID schedule runs too long and IT Scheduling must perform the after hours monitoring, then they would update the FAIDAW99_EXCLUDE_DATE with 08/08 (after yesterday’s FAID jobs have all completed) as documented in your procedures.

Janice.
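
The after-midnight rule for the exclude date can be sketched as below. The cutoff_hour parameter is an assumption: the note does not say at what hour an update stops counting as belonging to the previous day's schedule, so 06:00 is only a placeholder.

```python
from datetime import datetime, timedelta

def exclude_date(now, cutoff_hour=6):
    """Return the mm/dd value for the exclude-date subvar.

    Updates made after midnight but before cutoff_hour (hypothetical)
    use the previous calendar date, i.e. the scheduled run date.
    """
    if now.hour < cutoff_hour:
        now = now - timedelta(days=1)
    return now.strftime("%m/%d")

print(exclude_date(datetime(2008, 8, 8, 20, 0)))   # evening update
print(exclude_date(datetime(2008, 8, 9, 1, 30)))   # update after midnight
```

Both calls give 08/08, which matches the value quoted for the FAID schedule in the note above.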

Aborted Module Name: SCIQSQ2F.SCIQS009_01

Date: Day: Time: Resolution:

08/13/08 Wed 17:20 Restarted by Jan, see Jan’s comments below.

Error log and follow up comments:

+ + expr 6 + 1

loopcnt=7

+ [ 7 -lt 7 ]

+ [ n = y ]

+ /Applications Manager/exec/FILESIZE SCIQSQ2F.SCIQS009_01.1645544.1645545.00.2008_08_13_1700.jobout 100

no output from SCIQSQ2F.SCIQS009_01

+ err=100

+ date

+ echo exiting SQLP_CSU Wed Aug 13 17:00:45 MDT 2008

exiting SQLP_CSU Wed Aug 13 17:00:45 MDT 2008

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

SCIQSQ2F.SCIQS009_01 is complete, above was the error. Kelly created a synonym to resolve.

Jan.

Aborted Module Name: ADMSSATL.SRTLOAD-SRRSRIN_01

Date: Day: Time: Resolution:

08/29/08 Fri 14:38 Restarted by David after Marcella corrected data.

12/11/08 Thu 15:40 Restarted by David, see error he found below.

Error log and follow up comments:

08/29/08.

chain_status=SRTLOAD_COMPLETE

+ [[ SRTLOAD_COMPLETE != SRRSRIN_COMPLETE ]]

+ print SRTLOAD_SRRSRIN UNSUCCESSFUL - ABORT MODULE

SRTLOAD_SRRSRIN UNSUCCESSFUL - ABORT MODULE

+ exit 1

+ err=1

A bad record in SRIPREL was causing this to fail. Marcella purged it and the job finished successfully.

David.

12/11/08.

SRTLOAD_SRRSRIN UNSUCCESSFUL - ABORT MODULE

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

I restarted ADMSSATL.SRTLOAD-SRRSRIN_01.

I found the error:

ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","UPDATE GOTCMRT SET GOTCMRT_M...","sql area","tmp")

David.

Aborted Module Name: ADMSWEBC.SRTLOAD_01

Date: Day: Time: Resolution:

09/28/09 Mon 20:00 See note from Janice, Bev & Jan below.

Error log and follow up comments:

print *** \n*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_Applications Manager_joblog_exceptions

+ egrep -i -f /ais01/dat/misc/prod/errstrg_Applications Manager_joblog /Applications Manager/out/ADMSWEBC.SRTLOAD_01.3369958.3369961.00.2009_09_28_2000.AWPROD.LOG

+ 1>> /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

+ rm -ef /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

rm: removing /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

+ grep FTP_

+ print ADMSWEBC.SRTLOAD_01

+ rm -ef /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

+ awexe get_var_value subvar=#critchain_3369958

+ + awexe get_var_value subvar=#critchain_3369958

this_critchain=N

+ [[ N = Y ]]

+ print NON-CRITICAL CHAIN COMPONENT FAILURE

NON-CRITICAL CHAIN COMPONENT FAILURE

+ show_status=ABORTED

+ status=ABORTD

+ rm -f /Applications Manager/run/temppar.4703455

+ 1> /dev/null 2>& 1

+ [ 139 != 0 ]

+ echo Non-zero error generated from running job. The program srtload failed to run successfully.

Non-zero error generated from running job. The program srtload failed to run successfully.

From the banner log file (ADMSWEBC.SRTLOAD_01.3369958.3369961.00.1680658.log):

Address information is missing for record with name of

Sierra Helterbrand and SSN of

From the banner lis file (ADMSWEBC.SRTLOAD_01.3369958.3369961.00.1680658.lis):

Number of Records Read from Tape: 19

Total of Prospects Loaded : 19

Total of PIDMs Matched : 0

Total of Conversion Errors : 22

All of the output files listed above should be viewable via Applications Manager Output File Viewer.

Janice.

Vicki and I searched the Banner UDC and found an entry that may have a solution. Mark Britton is checking it out.

Admissions is going to try to run the job without using Applications Manager, to eliminate that as a contributing factor.

Bev.

I’ve deleted ADMSWEBC.SRTLOAD_01 to allow the schedule to complete per Vicki.

Jan.

Aborted Module Name: FAIDALEX_EV.SSH_SFTP_01

Date: Day: Time: Resolution:

08/26/08 Tue 18:00 See Jan’s note below.

Error log and follow up comments:

The following Applications Manager module is in "EMPTY FILE" status:

FAIDALIM.SSH_SFTP_DL_01

27 08:16:07-Parent: sleeping for 10 seconds.

27 08:16:17-Parent: (2)Checking child process(786592)

27 08:16:17-Parent: Child process[786592] found

27 08:16:17-Parent: Checking child mem

27 08:16:17-Parent: Value in mem [N]

27 08:16:17-Looking for [/Applications Manager/run/kill.1688722.00]

27 08:16:17-No Kill File found('/Applications Manager/run/kill.1688722.00').

27 08:16:17-Parent: sleeping for 10 seconds.

# > Authenticated with partial success.

# > Permission denied (password,gssapi-with-mic).

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

# > (100)

#==============================================================================

# FATAL : Command failed with code : 100

I deleted out all modules left to run and running except CHAIN_FINISH. I requested FAIDALIM chain in to run again, deleting modules that had successfully run (verifying that the driver file had been built correctly). FAIDALIM is complete.

Jan.

Aborted Module Name: ODSRTEST.ODSRS002_07

Date: Day: Time: Resolution:

09/09/08 Tue 07:34 Re-started by David.

Error log and follow up comments:

ERROR at line 8:

ORA-06550: line 8, column 5:

PLS-00201: identifier 'CSUG_ODS_REFRESH.LOG_BEGIN_TIME' must be declared

ORA-06550: line 8, column 5:

PL/SQL: Statement ignored

ORA-06550: line 66, column 5:

PLS-00201: identifier 'CSUG_RUN_OWB_TASK' must be declared

ORA-06550: line 66, column 5:

PL/SQL: Statement ignored

ODSRPROD.ODSRS002_07 was set to the wrong login. I corrected this and re-started it.

David.

I notice that this is a new component for refreshing HR on ODS. It’s dependent on the EIDS ODS refresh, but I was wondering if it should be dependent on any HRMS updates completing?

Janice.

Aborted Module Name: DOITDEMO_01.FTPS_CURL_01

Date: Day: Time: Resolution:

09/19/08 Fri 17:00 See Janice’s note below, no output log to display.

06/19/11 Sun 21:47 Restarted by Steve G.

Error log and follow up comments:

09/19/08.

HRMSAW99 is waiting for DOITDEMO, which failed in the FTPS_CURL. I think this is a test run, so I left it. But I put DOITDEMO into the HRMSAW99 exceptions file so HRMSAW99 can complete. For follow up, DOITDEMO should be changed from the HRMS application/queue to the DOIT application/queue, which will eliminate holding up HRMSAW99 in the future. Janice.

06/19/11.

# REMOTE USER : [$ZB01]

# > curl: Can't open '/ais01/ftp/to/user/DOITDEMO_01.HRMSS051_01.DAT'!

# > curl: try 'curl --help' or 'curl --manual' for more information

# > (26)

#==============================================================================

# FATAL : Command failed with code : 26

#------------------------------------------------------------------------------

# 2011.06.30-21:47:53 : RETURN CODE = 100 : /appworx/csu/exec/FTPS_CURL.PL

#==============================================================================

# > (100)

Elden and I have been looking at this abort, and we think we may have found the problem. With clues from the error message above, it looks like the DOITDEMO_01.HRMSS051_01.DAT file (utl_file1) was never sent to the /ais01/ftp/to/user directory by the HRMSDEMO_01.HRMSS051_01 component of HRMSDEMO_EXTRACT. However, the component DID save the same utl_file1 to the /ais01/bkp directory as HRMSDEMO_01.HRMSS051_01.utl_file1.2011_06_30_2147.bak in a later step. This appears to be a timing issue.

If the HRMS folks can look at the /ais01/bkp/HRMSDEMO_01.HRMSS051_01.utl_file1.2011_06_30_2147.bak file and see if the data looks good, we can copy that file to /ais01/ftp/to/user/DOITDEMO_01.HRMSS051_01.DAT and see if we can restart the failed DOITDEMO_01.FTPS_CURL_01 component. Please advise. Thanks!

Steve G.
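
The recovery proposed above (copy the .bak file into place once the HRMS folks have looked at it) could be sketched as follows. The function and its approved flag are hypothetical; they only illustrate keeping the manual review step in front of the copy.

```python
import os
import shutil

def restore_from_backup(target, backup, approved=False):
    """Copy backup to target only when the target is missing, the backup
    holds data, and a person has approved the backup contents."""
    if os.path.exists(target):
        return "target already present"
    if not os.path.isfile(backup) or os.path.getsize(backup) == 0:
        return "backup missing or empty"
    if not approved:
        return "waiting for review"
    shutil.copy2(backup, target)
    return "restored"
```

A call would pass the two paths named in the note, with approved=True only after the data has been checked. The failed FTPS_CURL component would still need to be restarted separately.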

Aborted Module Name: HRMSFRS_SAL.FTP_AIS01_AIS00_01

Date: Day: Time: Resolution:

10/23/08 Thu 11:30 See Janice’s note below.

Error log and follow up comments:

Net::FTP=GLOB(0x30334bac)<<< 451-Transfer aborted. Error during I/O processing. System code is B37-04

# put + : Transfer aborted. Error during I/O processing. System code is B37-04

The mainframe file ran out of space (B37)– I re-allocated PMDT.APPLICATIONS MANAGER.HRMSFRS.SAL.PFRS05J2.FRS in cylinders instead of tracks and resubmitted this component – it finished successfully.

Janice
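
For a sense of scale on the tracks-versus-cylinders change: on 3390-type DASD a cylinder holds 15 tracks, so the same numeric quantity requested in cylinders gives 15 times the space. The sketch below assumes the dataset sits on a 3390-geometry volume, and the quantity 50 is purely illustrative, since the note does not give the actual allocation.

```python
TRACKS_PER_CYLINDER = 15  # 3390 geometry (assumed device type)

def allocation_in_tracks(quantity, unit):
    """Convert a space quantity given in TRK or CYL to tracks."""
    if unit == "TRK":
        return quantity
    if unit == "CYL":
        return quantity * TRACKS_PER_CYLINDER
    raise ValueError("unit must be TRK or CYL")

print(allocation_in_tracks(50, "TRK"))  # 50 tracks
print(allocation_in_tracks(50, "CYL"))  # 750 tracks
```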

Aborted Module Name: AREGORCC.CHAIN_INIT_01

Date: Day: Time: Resolution:

10/30/08 Thu 15:30 See David’s, Janice’s & Dawn’s notes below.

Error log and follow up comments:

this_chain_start=30-Oct-2008 15:22:10

+ cat /dev/null

+ Cannot write to a directory.

/Applications Manager/csu/exec/CHAIN_INIT.KSH[42]: /ais01/ftp/to/eprint/: 0403-005 Cannot create the specified file.

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

I talked to Dawn and she will re-submit this. The AREGORCC chain was brought in without a schedule prefix.

David.

This chain was requested in without providing the chain prefix value. Whenever chain components appear in backlog with names like “.CHAIN_INIT_01” – i.e. no chain prefix preceding the “.”, then it is a problem with the way the chain was requested to run.

By the way, if the “request” procedure is to be used to request this chain, we should probably change prompt #1 to “value required”, which will prevent the chain from being requested without providing a value for prompt #1.

Or… we could set up a list of values with the “valid” chain prefix values and then IT Scheduling could use the request procedure and just choose the correct chain prefix value from the LOV (similar to what we are doing with HRMSAW90).

Janice.

When Denise requested this chain to run today, the chain notes said to use the Request procedure. I spoke to Jan and she had me update the note so it said Schedule procedure.

Dawn.
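
The two ideas in Janice's note (spotting components with no chain prefix, and forcing a valid prefix at request time) can be sketched as below. Both function names and the list of valid prefixes are hypothetical; the real prompt and LOV settings live in Applications Manager itself.

```python
def has_chain_prefix(component_name):
    """True when something precedes the first '.' in the component name."""
    prefix, dot, _ = component_name.partition(".")
    return bool(dot) and bool(prefix)

def validate_prefix(value, valid_prefixes):
    """Mimic a 'value required' prompt backed by a list of values."""
    value = (value or "").strip()
    if not value:
        raise ValueError("chain prefix is required")
    if value not in valid_prefixes:
        raise ValueError("unknown chain prefix: " + value)
    return value

print(has_chain_prefix(".CHAIN_INIT_01"))          # False
print(has_chain_prefix("AREGORCC.CHAIN_INIT_01"))  # True
```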

Aborted Module Name: AROSFRQ1.GLBEXTR_POPSEL_01

Date: Day: Time: Resolution:

01/13/10 Wed 01:00 DB error; restarted by David.

Error log and follow up comments:

I got paged at 1:00am about a DB ERROR on AROSFRQ1.GLBEXTR_POPSEL_01. There was no output file, but Conditions showed Timing of "BEFORE" and Performed of "DONE". Called David and left a message with the information above. He called back and said he was logging in to check it out.

Steve.

AROSFRQ1.GLBEXTR_POPSEL_01 failed with a DB ERROR. I reset DONE conditions and re-started. There were no other jobs with errors.

David.

Aborted Module Name: ODBAMNTR.ODBAS001_01

Date: Day: Time: Resolution:

09/08/09 Tue 07:00 See note from Janice below.

Error log and follow up comments:

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ORA-12541: TNS:no listener

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

Received EM alerts at approximately 03:00 this morning that production systems were down. It appears the recycle job got hung up. I killed the recycle jobs and manually ran the oracle_system_startup script and oracle_famis_system_startup script. Everything appears to be up and running now. Mark. B.

Due to the problem with the Oracle recycle process Sunday night, all of the ODS Refresh Applications Manager Chains had components which failed with: ORA-12541: TNS:no listener

These ODS refresh chains are dependent on the Applications Manager recycle chain (ODBACYCP_RECYCLE_PROD_SYSTEMS) which contains the ODBA_RECYCL_PRD chain component to execute the /app/oracle/admin/dba/mgr/oracle_system_recycle script. Even though the recycle did not successfully complete last night, /app/oracle/admin/dba/mgr/oracle_system_recycle apparently returned a zero return code -- thereby allowing dependent ODS Refresh jobs to proceed and subsequently fail due to the TNS no listener problem.

Would it be possible for the /app/oracle/admin/dba/mgr/oracle_system_recycle script to return a non-zero return code when such problems occur? If the error could be trapped, then dependent ODS Refresh Chains would not run until the problem was resolved and appropriate DBA, plus Applications Manager followup, had been done. When manual activity is taken to resolve the oracle recycle problem, the failed ODBA_RECYCL_PRD chain component of the ODBACYCP_RECYCLE_PROD_SYSTEMS chain would also need to manually be deleted, which would then allow dependent ODS Refresh chains to proceed.

The dependency connection between the "Oracle System Recycle" chain and ODS Refresh Chains will only be meaningful and effective if failure(s) in the "Oracle System Recycle" script are detected and reported back to Applications Manager. Janice.

There was no failure; it got stuck, so there really was no way to report this back programmatically. It looked like there was some code in one of the scripts to detect that the process is hung, but as far as I could tell it didn't work. Mark. B.

For the second time in two weeks, Sunday night ODS Refresh/Applications Manager Chains had components which failed with Enter user-name: ERROR: ORA-12541: TNS:no listener

due to problems with the Oracle recycle process. Although the 8/23 situation was apparently not programmatically eligible for reporting back to Applications Manager, I'm wondering if last night's situation (9/7/09) was something that should/could have been reported back to Applications Manager and/or resulted in a DBA page? Janice.

Nope it wasn't. Mark. B.
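
One possible way to turn a hung script into a non-zero return code is to run it under a deadline. This is only a sketch of that general technique, not a description of how oracle_system_recycle works. The wrapper name, the deadline value, and the return code 124 are all assumptions.

```python
import subprocess
import sys

def run_with_deadline(cmd, timeout_seconds, hung_rc=124):
    """Run cmd and return its exit code, or hung_rc if it exceeds the deadline.

    subprocess.run kills the child when the timeout expires. Processes
    started by that child may survive and need separate cleanup.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return hung_rc
    return completed.returncode

# Demonstration with a stand-in command that sleeps past a short deadline.
rc = run_with_deadline([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)
print(rc)
```

A scheduler component that calls the recycle script through a wrapper like this would abort on a hang, which would hold back the dependent ODS Refresh chains until someone followed up. Choosing a safe deadline for a real recycle is a separate question.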

Aborted Module Name: HRMSSQWL.SQWLARCH-LOOP_01

Date: Day: Time: Resolution:

01/13/09 Tue 07:30 See Debbie’s & Janice’s notes below.

Error log and follow up comments:

+ egrep ABORTED|CRITFAIL|C-Error

+ awexe jh

+ grep 2220862

2220862.00 BATCH HRMSSQWL.SQWLARCH_CO01/12 17:23 02:11:48 C-Error DGUZMAN HRMSSQWL_STATE_QTRLY_WAGE_LIST

+ print Failure in spawned CO - abort SQWLARCH-LOOP

Failure in spawned CO - abort SQWLARCH-LOOP

+ exit 1

+ err=1

SQWLARCH FAILED – Colorado

Please refer to email that was generated from the ABORT on Monday at 5:23pm and follow IT Scheduling instructions.

Debbie

As a reminder, the SQWLARCH-LOOP is structured to automatically email the user (and IT Scheduling) regarding SQWLARCH failures. The email for the failed SQWLARCH for Colorado was sent when it failed at 05:23 P.M. yesterday. Elaine should be doing the normal manual follow-up and then will contact IT Scheduling to request a restart. Please refer to the various HRMSSQWL related emails sent late last week for more details.

After reviewing those emails, if you have any questions about the process then let me know.

Janice

The sqwl stuff directly sends a follow-up email to the users (HRSAO SQWLArch Followup), as well as to the Alert HRMS WHRS and Alert APMX lists. Please refer to the earlier mail, with subject PYUSSQWLGRE(ID=6792213) SPAWNED SQWLARCH PROBLEM - Colorado, dated Tue 7/12/2011 9:58 AM. Because we have this automated feedback reporting for the SQWL failures, there is no need to also send the normal HRMS ABORT followup email.

Janice.

Aborted Module Name: HRMSW2P2.WAIT_FOR_W2PDF_01

Date: Day: Time: Resolution:

01/13/09 Tue 20:46 See notes below.

01/12/10 Tue 15:50 See notes below, similar to previous year’s ABORT.

Error log and follow up comments:

01/13/2009.

There is no output file to look at – what would be our next step? Thanks...

Look at the module conditions on that module.

Jan.

Today Jan was showing us that this module aborts after it cannot find the PDF file after 5 hours. Janice said this was due to the HRMS CONCURRENT MANAGER JOB FAILURE. Has the problem been solved? Could this be the reason this chain has failed again?

The W2s have been running for over 5 hours which is causing this abort message.

The problem is that the W2s are taking a long time to generate.

Alan.

While this morning the problem was a HRMS CONCURRENT MANAGER JOB FAILURE, the latest abort was simply due to the fact that we exhausted the time interval for checking for the spawned concurrent process(es) to complete. The spawned concurrent process to generate the W2 PDF’s is still running. I’ve restarted the failed component and increased the time interval.

Janice

P.S. I’ve talked with Ken about checking on this chain tonight – so I’ll plan to monitor it over the course of the evening.

01/12/2010.

There is no output file.

HRMSW2P2.WAIT_FOR_W2PDF_01 timed out at 5 hours waiting for the W2 file to be processed. I gave it more time and re-started it.

David.
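
Both aborts came down to a wait loop running out of time before the W2 PDF appeared. A minimal sketch of such a loop with the time limit as a parameter. The function name, the poll interval, and the injectable clock and sleep arguments are assumptions for illustration and are not taken from the WAIT_FOR_W2PDF script.

```python
import os
import time

def wait_for_file(path, timeout_seconds, poll_seconds=60,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll until path exists (True) or timeout_seconds elapse (False)."""
    deadline = clock() + timeout_seconds
    while True:
        if os.path.exists(path):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)

# "Giving it more time" is then a matter of passing a larger limit,
# e.g. 8 hours instead of 5 (values are illustrative):
# wait_for_file("/path/to/w2.pdf", timeout_seconds=8 * 3600)
```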

Aborted Module Name: HRMSACH_SAL.PAYUSXFR_01

Date: Day: Time: Resolution:

01/21/09 Wed 08:30 See David’s note below.

01/23/09 Fri 09:30 See David’s note below.

Error log and follow up comments:

01/21/09.

PAYUSXFR had some date variables that wrapped to the next field. We manually fixed the variables and the job is complete.

David.

01/23/09

HRMSACH_SAL.PAYUSXFR_01 had problems with variables wrapping. I manually fixed this and it is complete.

David.

Aborted Module Name: VSTAJOBS.VPLUS_MIGRATION_01

Date: Day: Time: Resolution:

03/29/10 Mon 07:05 See follow up below.

Error log and follow up comments:

mv: 0653-401 Cannot rename vmfiles/CompressGens.lis.old to vmfiles/CompressGens.lis.old.old:

A file or directory in the path name does not exist.

+ exit 7

Child: Job return = 7

29 07:05:40- Child: put to memory:[7]

29 07:05:40- Child: In memory:[7]

Child:Done.

29 07:05:40-Child:Done

Jan.

This job did produce error messages for /vptmp/tmp not found and the ‘mv: 0653-401’ messages when it was rerun a bit later.

I resolved the problem early today and ran the migration process which worked.

For some reason every time we restarted the job it just kept saying it aborted in Applications Manager.

I could not find any reason it kept failing when it worked outside of Applications Manager. My only suspicion is that something was searching the log for Abort messages and kept thinking it had failed when it hadn’t.

The resolution after I fixed the issue was to have Jan delete the job and rerun it fresh. This time it worked fine.

I am looking into a possible issue with some reports captured on Friday night that are in the database but not in the Vista reports filesystem.

The reports would show in Vista Plus, but not be accessible because the report file is not there. Only a database entry.

We are looking into some reports based on generation sequence number so we might be able to tell what reports it affected.

Then we can clean up the empty reports and rerun them in Applications Manager if we can.

Rich.
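
Rich's suspicion about a log search is in line with the kind of check visible in the ADMSWEBC.SRTLOAD_01 trace elsewhere in this log, where the job log is filtered through an error-string file and then an exceptions file with egrep. A sketch of that two-stage filter follows. The patterns and log lines are made up for illustration, and Python regular expressions are not identical to egrep's. Whether such a search actually caused the repeated aborts here is not known.

```python
import re

def flagged_lines(log_text, error_patterns, exception_patterns):
    """Return lines matching any error pattern and no exception pattern."""
    errors = [re.compile(p, re.IGNORECASE) for p in error_patterns]
    exceptions = [re.compile(p, re.IGNORECASE) for p in exception_patterns]
    flagged = []
    for line in log_text.splitlines():
        if any(e.search(line) for e in errors) and \
                not any(x.search(line) for x in exceptions):
            flagged.append(line)
    return flagged

log = "mv: 0653-401 Cannot rename a to b\nmigration finished\nchecking for abort flag: none\n"
print(flagged_lines(log, ["abort", "0653-"], ["checking for abort"]))
```

A harmless line that happens to contain an error string gets flagged unless a matching exception pattern is on file, which is one way a job could keep being reported as failed.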

Aborted Module Name: ODSRTEST.ODSRS002_07

Date: Day: Time: Resolution:

02/02/09 Mon 07:39 See notes below.

Error log and follow up comments:

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUH_CURRENT_PERSON

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 75

Mapping has been refreshed manually. HR Test db must have been off-line last night.

Please do not restart the chain. The refresh of HR on ODS Test will run tonight.

Mark.

We deleted this chain. As it turns out, we should have only deleted the module per Jan.

When communicating regarding course of action for failed Applications Manager chain components, it is important that there is a clear understanding of the Applications Manager terminology (chain component vs. chain) and the difference between deleting an Applications Manager chain component vs. deleting an Applications Manager chain from backlog. While the desire may have been to not restart the failed ODSRTEST.ODSRS002_07, deletion of the ODSRTEST chain not only deleted the failed ODSRTEST.ODSRS002_07 component but also deleted the following chain components:

2009-02-02 07:26:04.0	2009-02-02 07:26:04.0	00:00:00	ODSRTEST.ODSRS004_02	DELETED	ODSRTEST_REFRESH_ODSTEST	ODSR	AWPROD	2318769
2009-02-02 07:26:04.0	2009-02-02 07:26:04.0	00:00:00	ODSRTEST.SEND_MAIL_01	DELETED	ODSRTEST_REFRESH_ODSTEST	ODSR	AWPROD	2318755
2009-02-02 07:26:04.0	2009-02-02 07:26:04.0	00:00:00	ODSRTEST.CHAIN_FINISH_01	DELETED	ODSRTEST_REFRESH_ODSTEST	ODSR	AWPROD	2318753

The ODSRTEST.ODSRS004_02 chain component which was deleted would have updated the ODSTEST csug_ods_refresh_status table with an “end” time for the OVERALL_NIGHTLY_REFRESH table entry – to indicate the time that all ODSRTEST refresh components had completed. Currently, the ODSTEST csug_ods_refresh_status table entry for OVERALL_NIGHTLY_REFRESH has a BEGIN_TIME value of 01-FEB-09 11.00.49 PM, but a null END_TIME value due to deleting the ODSRTEST.ODSRS004_02 chain component.

The ODSRTEST.SEND_MAIL_01 chain component which was deleted would have sent the “ODSTEST Refresh Statistics” summary email to the ODSR email list.

Finally, the ODSRTEST.CHAIN_FINISH_01 chain component which was deleted would have performed chain cleanup, including deletion of chain specific Applications Manager subvars and deletion of /ais01/dat/work/prod/ODSRTEST* work files. Also, the CHAIN_FINISH component has a BEFORE condition to set the subvar value: #ODSR_RUN_ODSRTEST={#ODSR_RUN_ODSRTEST_SETVAL}

By deleting the CHAIN_FINISH component, this BEFORE condition was not performed which, in this particular situation, did not cause problems because the #ODSR_RUN_ODSRTEST_SETVAL subvar had the same value as the current value of #ODSR_RUN_ODSRTEST. However, if #ODSR_RUN_ODSRTEST_SETVAL had been different from #ODSR_RUN_ODSRTEST, deletion of this chain component would have resulted in #ODSR_RUN_ODSRTEST having an incorrect value.

Janice.
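
The difference between deleting a chain component and deleting a chain can be shown with a toy model of the backlog as a list of component names. This is only an illustration of the terminology; it does not reflect how Applications Manager stores or deletes backlog entries.

```python
def delete_component(backlog, component):
    """Remove a single chain component, leaving the rest of the chain."""
    return [name for name in backlog if name != component]

def delete_chain(backlog, chain):
    """Remove every component whose name starts with the chain prefix."""
    return [name for name in backlog if not name.startswith(chain + ".")]

backlog = [
    "ODSRTEST.ODSRS002_07",
    "ODSRTEST.ODSRS004_02",
    "ODSRTEST.SEND_MAIL_01",
    "ODSRTEST.CHAIN_FINISH_01",
]
print(delete_component(backlog, "ODSRTEST.ODSRS002_07"))
print(delete_chain(backlog, "ODSRTEST"))
```

The first call leaves the three follow-on components in place; the second leaves nothing, which is what happened to ODSRS004_02, SEND_MAIL_01 and CHAIN_FINISH_01 here.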

Aborted Module Name: AROSFRQ1.TGRAPPL_01

Date: Day: Time: Resolution:

04/27/11 Wed 07:15 See follow up from Janice below.

Error log and follow up comments:

Username:

Password: Connected.

tgrappl completed successfully

0 lines written to /appworx/out/AROSFRQ1.TGRAPPL_01.6152805.6158460.00.2273961.lis

Starting TGRAPPL (Release 8.1.1.1)

*********************************************************

* **WARNING** *

* You cannot submit this job - it is already running. *

* *

* You will also get this message if a previous run of *

* this program aborted. If this is the case, the *

* control record for that run must be deleted before *

* proceeding. (GJBPRUN record for this jobname with *

* a -1 one-up-no). *

* *

There was a timing problem between the AROSFRQ1 spawned AROS_PYMTS chain, in which TGRAPPL is executed, and the AROSDPA3 chain that also executes TGRAPPL. AROSDPA3 is dependent on AROSAM27_STOP_AROSFRQ1_PYMTS, which creates the /ais01/dat/apwx/prod/AROS-PYMTS-LOOP_daily_stop file. The presence of this file prevents the AROSFRQ1 AROS-PYMTS-LOOP script from spawning any new AROS_PYMTS chains. However, in the situation which occurred this morning, it appears that the timing was such that the AROS_PYMTS chain had already been spawned, but not enough time had elapsed between AROSAM27_STOP_AROSFRQ1_PYMTS and AROSDPA3 to allow for the AROS_PYMTS chain to complete.

I've added a 5 minute delay to AROSAM27 after creation of the AROS-PYMTS-LOOP_daily_stop file, which hopefully will prevent this situation in the future.

I actually thought some database cleanup was required when two TGRAPPL executions stepped on each other's toes... but I did try to restart AROSFRQ1.TGRAPPL, thinking that it wouldn't work anyway. However, much to my surprise, it completed successfully - sorry, I would have passed the restart on to Dawn had I really thought it would work :)

Janice.
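
The stop-file gate and the added settle delay can be sketched as two small checks. The function names are hypothetical; the 300-second default simply mirrors the 5 minute delay mentioned above, and timestamps are plain seconds.

```python
import os

def may_spawn_new_chain(stop_file):
    """The loop spawns a new AROS_PYMTS chain only while no stop file exists."""
    return not os.path.exists(stop_file)

def dependent_may_start(stop_created_at, now, settle_seconds=300):
    """True once settle_seconds have passed since the stop file was created,
    giving an already-spawned chain time to finish."""
    return now - stop_created_at >= settle_seconds

print(dependent_may_start(stop_created_at=0, now=120))  # False
print(dependent_may_start(stop_created_at=0, now=300))  # True
```

A fixed delay only narrows the window. If a spawned chain ever runs longer than the delay, the same collision could recur, so checking that no AROS_PYMTS chain is still running would be a stricter alternative.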

Aborted Module Name: HRMSENCD.HRMSS103_01

Date: Day: Time: Resolution:

04/28/09 Tue 07:38 See notes below.

Error log and follow up comments:

HRMSENCD.HRMSS103_01 has been running for over 11 hours, delaying the completion of the HRMS encumbrance processing (Applications Manager chain HRMSENCD_DAILY_ENCUMBRANCES). Consequently, the remainder of the HRMS/WFRS/WHRS Applications Manager schedules are also waiting for completion of the encumbrance HRMSENCD_DAILY_ENCUMBRANCES chain.

Please advise regarding HRMSS103 – and the course of action which should be taken.

Janice.

Craig killed the HRMSENCD.HRMSS103_01 associated Oracle process and we’ve restarted it. However, historically HRMSS103 only runs 2-3 minutes and it has already been running for more than 10 minutes – how long should we allow HRMSS103 to run?

Janice.

When I tried to locate the error, Applications Manager froze.

I tried to get back into Applications Manager and the same thing happened.

Module HRMSENCD.HRMSS103_01 is in CRITFAIL status. So, Joleen and I both tried to look at the output file so we could send out an e-mail. Jan called to say the output file was way too big (80599209 Bytes) and trying to open it is what took Applications Manager down. She does not want anyone to try to open this output file. I asked her if there was a way to know how big is too big. She said she was talking to her co-workers about it and they don’t know the answer. So, please do not open the output file for the above mentioned module.

Dawn.
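
One way to avoid opening a very large output file in Applications Manager is to check its size first and read only the end of it outside the client. The following is a minimal sketch, not an established procedure: the function name tail_if_large and the 10 MB threshold are assumptions for illustration, since nobody above could say how big is too big.

```python
import os

# Hypothetical threshold; the log does not establish what size is safe to open.
MAX_SAFE_BYTES = 10 * 1024 * 1024


def tail_if_large(path, tail_bytes=4096, max_safe_bytes=MAX_SAFE_BYTES):
    """Return (size, text). Text is the whole file if small, else only its tail."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        if size > max_safe_bytes:
            # Skip ahead so that only the last tail_bytes bytes are read.
            handle.seek(max(size - tail_bytes, 0))
        data = handle.read()
    return size, data.decode("utf-8", errors="replace")
```

The error in a job log is usually near the end, so the last few kilobytes are often enough to write the notification e-mail.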

Aborted Module Name: AROSDTRN.TSRCBIL_01

Date: Day: Time: Resolution:

09/29/09 Tue 08:30 See note from Janice below.

Error log and follow up comments:

The following module is in LAUNCH ERROR status:

AROSDTRN.TSRCBIL_01

I already tried to restart it.

Jan called to say they are working on this.

Dawn.

While the banprod_jobprd AWPROD link to banprod is able to execute other sql to populate Applications Manager variables, the problem seems to be isolated to the inability to execute a function via the link – note that the jobprd userid directly logged onto banprod can execute the function(s).

As a short term solution, I’ve made backup copies of the various AROS subvars which execute function(s) in the underlying SQL logic.

Then I manually executed the functions (as jobprd on banprod) to determine the value which would have been returned and hard-coded the resultant values into the corresponding AROS Applications Manager subvars as shown below. These hard-coded values will remain adequate only until the functions would return something different from what I hard-coded. In the meantime, they give us a short window in which to solve the problem while still allowing the various AROS chains to run.

#AROS_CUR_TERM Type=Numbers {200990}

#AROS_CUR_TERM_bkp Type=Numbers {SQL}

#AROS_NEXT_TERM Type=Numbers {201010}

#AROS_NEXT_TERM_bkp Type=Numbers {SQL}

#AROS_PREV_TERM Type=Character {200960}

#AROS_PREV_TERM_bkp Type=Character {SQL}

Janice.

Aborted Module Name: ADMSAPPL.LYNX_01

Date: Day: Time: Resolution:

05/21/09 Thu 22:20 Restarted by Janice.

Error log and follow up comments:

URL=http://wsprod.colostate.edu/cwis116/application/BanTranPay.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

I talked to Bev about this aborted component and learned that the normal course of action is to just restart it – so it’s running again.

Janice.
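
The trace above shows a non-OK HTTP status becoming err=100 and then status=ABORTD. A minimal sketch of that mapping follows. It assumes that only 2xx responses count as OK, which the log does not confirm, and the names exit_code_for_status and component_status as well as the "FINISHED" success label are placeholders, not the wrapper's real ones.

```python
import re


def exit_code_for_status(status_line, error_code=100):
    """Return 0 for a 2xx HTTP status line, otherwise error_code."""
    match = re.search(r"HTTP/\d\.\d\s+(\d{3})", status_line)
    if match and 200 <= int(match.group(1)) < 300:
        return 0
    return error_code


def component_status(err):
    # ABORTD appears in the trace; "FINISHED" is a placeholder for the success case.
    return "ABORTD" if err != 0 else "FINISHED"
```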

Aborted Module Name: FAIDVRWF_EV.FAIDS025_01

Date: Day: Time: Resolution:

04/14/11 Thu 15:30 See follow up below.

Error log and follow up comments:

Enter user-name: ERROR:

ORA-12519: TNS:no appropriate service handler found

Enter user-name: SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon> ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

Enter user-name: SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon> ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

+ [ -f login.2040276 ]

+ echo Could not login to sqlplus

Could not login to sqlplus

+ err=1

Since this is a cyclic chain, we already have 3 more in self wait status. Please delete 2 of the ones in self wait status, then delete this failed component. I think the Banner problem has been resolved.

It would be okay to go ahead and delete the FAIDVRWF self wait chains which will keep coming into backlog every 15 minutes - then when the problem is resolved, we'll have been keeping current on cleaning those up. We won't need/want to run all the backlogged ones.

Janice.

Aborted Module Name: OSYSJOBS_06.OSYSPURG_01

Date: Day: Time: Resolution:

12/10/09 Thu 16:31 See note from Janice below.

Error log and follow up comments:

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_06.OSYSPURG_01.3680460.3680464.00.2009_12_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_06.OSYSPURG_01.3680460.3680464.00.2009_12_10_1630.log

+ cut -f 2 -d =

+ grep SCRIPT ABORTED

rsh_return_code=1

+ rm -ef /ais02/log/OSYSJOBS_06.OSYSPURG_01.3680460.3680464.00.2009_12_10_1630.log

rm: Removing /ais02/log/OSYSJOBS_06.OSYSPURG_01.3680460.3680464.00.2009_12_10_1630.log

+ print *** \n*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1 \n*** EXIT WITH EXIT CODE=1 \n***

***

*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1

*** EXIT WITH EXIT CODE=1

***

+ exit 1

Child: Job return = 1

10 16:31:06- Child: put to memory:[1]

10 16:31:06- Child: In memory:[1]

The OSYSPURG jobs failed trying to perform cleanup of the /alm_orautl/b directory, which apparently no longer exists. I’ve created a temp version, /ais02/job/temp/sys_purg_rsh.ksh, to bypass the “b” instance orautl cleanup. The logic in sys_purg_rsh.ksh is driven from the /orautl directory, so DBAs should remove the /orautl/b links which exist on Empire and Kebler to the non-existent /alm_orautl/b directory if the “b” instance has been deleted. There also may be other obsolete links in the /orautl directories (BAN8@ -> /cre_orautl/BAN8, BANTRNG@ -> /ban_orautl/BANTRNG/). I also noticed that in some cases the /orautl link points to a /***_orautl directory, but the same directories exist in other directories; for example, /orautl/BANTEST@ -> /cre_orautl/BANTEST/, but BANTEST is also a subdirectory under the /ban_orautl directory structure.

I noticed that the logic in sys_purg_rsh.ksh is so old that we only had the orautl cleanup being performed for the “a”, “b”, and various “hr” instances. The current /orautl cleanup criteria within sys_purg_rsh.ksh is any file older than 7 days within the /orautl/a or /orautl/hr* directories. Should we also be cleaning up the various BAN*, ods* and kfs* instances? As examples, /orautl/BANPROD has files dating back to 2005 and /orautl/odsprod space used is 70470341, with most of the large files dated between Jan 2009 and March 2009.

Janice.
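
The two checks discussed above (links pointing at directories that no longer exist, and files older than 7 days) can be sketched as follows. This is an illustration only and not the logic of sys_purg_rsh.ksh; the function names are hypothetical, and the sketch only reports candidates and deletes nothing.

```python
import os
import time


def dangling_links(directory):
    """Return names of symlinks in directory whose targets do not exist."""
    result = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        # os.path.exists follows the link, so it is False for a dangling link.
        if os.path.islink(full) and not os.path.exists(full):
            result.append(name)
    return result


def files_older_than(directory, days=7, now=None):
    """Return regular files directly under directory last modified more than `days` ago."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    old = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if (os.path.isfile(full) and not os.path.islink(full)
                and os.path.getmtime(full) < cutoff):
            old.append(name)
    return old
```

Running dangling_links against /orautl before the purge would have flagged the “b” link instead of letting the cleanup fail on it.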

Aborted Module Name: AREGORGN.SPOOL_TO_PRINT_02

Date: Day: Time: Resolution:

10/29/10 Fri 17:25 Deleted by Janice.

Error log and follow up comments:

# -> [*******************************************************************************]

# -> [FATAL EXIT CALLED FROM [spool_filter::fatal]]

# -> [-------------------------------------------------------------------------------]

# -> [ERROR: file /ais01/spool/out/AREGORGN.AREGS706_01.5280320.txt not found]

# -> [-------------------------------------------------------------------------------]

# -> [[ 2010.10.29-17:25:19 ]]

# -> [RETURN CODE = 100]

# -> [===============================================================================]

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

10/30/2010 10:44 JMWILKIN

I saw the emails from Joleen about the AREGORGN failure and decided to take a quick look. The failure in the SPOOL_TO_PRINT, as well as a spawned SEND_MAIL, are due to the fact that the utl_file1 from the AREGS706 component was empty. As followup, there should be a task to correct the AFTER conditions on the AREGS706 so spool driver entry and SEND_MAIL are not done when utl_file1 is empty.

I deleted these failed components so the AREGAW99 chain can complete.

Problem has been resolved – the AREGS706 generated no output to be printed or emailed. I deleted the failed components so the chain could complete.

Janice.
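
The suggested correction to the AFTER conditions amounts to a guard like the one sketched below: only queue the spool entry and the SEND_MAIL when the output file exists and is not empty. The function name should_spool is hypothetical, and how the guard would be expressed as an Applications Manager condition is not shown here.

```python
import os


def should_spool(path):
    """True only when the file exists and holds at least one byte."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```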

Aborted Module Name: FAIDCFIM_FA.COF_RESP_01

Date: Day: Time: Resolution:

09/21/10 Tue 12:20 See notes below.

Error log and follow up comments:

Phil has just informed us that COF will not have a file available today for the FAIDCFIM_FA.COF_RESP_01 to process. While this component would eventually abort when it doesn’t find the file, it would be best to simply handle the situation now.

Please proceed with the steps outlined below, in the order specified:

1) Kill the FAIDCFIM_FA.COF_RESP_01 component – it should end up in KILLED status

2) Delete all the chain components which are in PRED WAIT status, except for the FAIDCFIM_FA.CHAIN_FINISH_01 component.

My preference is to display the chain in backlog via Flow Diagram, then select all the components to be deleted (in this case, FAIDCFIM_FA.DECRYPT_01 through FAIDCFIM_FA.VPLUS_RCAP-LOOP_01), then right click and select Delete 6 (the 6 indicates you’ve selected six components to be deleted).

3) Verify that all chain components which were deleted are in PW-DELETE status.

4) Delete the “KILLED” FAIDCFIM_FA.COF_RESP_01 component.

5) Verify that the FAIDCFIM_FA.CHAIN_FINISH_01 component finishes, thereby allowing FAIDCFIM_COF_IMPORT chain to complete.

On a more generic note, we often prefer to allow the CHAIN_FINISH chain component to run when we are deleting a chain that has started, but due to a failure or other reasons, is not to run to completion. One of the key reasons is that the many chain specific subvars which have been defined for the chain will be deleted via the CHAIN_FINISH component, as well as other general cleanup of work files and so on. However, it cannot be globally said that it would always be safe to run the CHAIN_FINISH component. Therefore, research would be necessary to determine if the CHAIN_FINISH component (or its associated BEFORE/AFTER conditions) would be taking any action(s) which should NOT be performed. As an example, the CHAIN_FINISH component of the FAIDCFEX_COF_EXPORT chain has an AFTER condition to request in the corresponding schedule of FAIDCFIM_COF_IMPORT. Obviously, if we are attempting to delete remaining components of a FAIDCFEX_COF_EXPORT chain, we would NOT want this condition to be performed. In this case, if we decide to let the CHAIN_FINISH component run, while deleting the remainder of the chain components, we would first have to disable the CHAIN_FINISH conditions to prevent them from running. CHAIN_FINISH components also may have filenames specified for the “Files to backup”, “Files to empty”, or “Files to delete” prompts which we may not wish to backup, empty or delete. In general, research is the key to safely allowing the CHAIN_FINISH component to run when deleting the remainder of the chain components.

Janice.
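
The selection rule in step 2 (delete everything in PRED WAIT except the CHAIN_FINISH component) can be sketched as below. The status dictionary is a stand-in for what the backlog shows; Applications Manager does not expose it in this form as far as this log indicates, and the function name is hypothetical.

```python
def components_to_delete(statuses, keep_suffix="CHAIN_FINISH_01"):
    """Return components in PRED WAIT status, excluding the chain-finish component."""
    return [
        name
        for name, status in statuses.items()
        if status == "PRED WAIT" and not name.endswith(keep_suffix)
    ]


# Only the component names mentioned in the note are listed here.
backlog = {
    "FAIDCFIM_FA.COF_RESP_01": "KILLED",
    "FAIDCFIM_FA.DECRYPT_01": "PRED WAIT",
    "FAIDCFIM_FA.VPLUS_RCAP-LOOP_01": "PRED WAIT",
    "FAIDCFIM_FA.CHAIN_FINISH_01": "PRED WAIT",
}
print(components_to_delete(backlog))
```

As the note says, whether CHAIN_FINISH is safe to keep still has to be researched per chain; the sketch only covers the mechanical selection.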

Aborted Module Name: OSYSJOBS_04.OSYSPURG_01

Date: Day: Time: Resolution:

12/10/09 Thu 16:37 Restarted by Janice.

Error log and follow up comments:

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_04.OSYSPURG_01.3680452.3680456.01.2009_12_10_1636.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_04.OSYSPURG_01.3680452.3680456.01.2009_12_10_1636.log

+ cut -f 2 -d =

+ grep SCRIPT ABORTED

rsh_return_code=1

+ rm -ef /ais02/log/OSYSJOBS_04.OSYSPURG_01.3680452.3680456.01.2009_12_10_1636.log

rm: Removing /ais02/log/OSYSJOBS_04.OSYSPURG_01.3680452.3680456.01.2009_12_10_1636.log

+ print *** \n*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1 \n*** EXIT WITH EXIT CODE=1 \n***

***

*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1

*** EXIT WITH EXIT CODE=1

***

+ exit 1

Child: Job return = 1

Janice.

Aborted Module Name: FAIDCFAT_FA.GLBDATA-LOOP_01

Date: Day: Time: Resolution:

09/07/10 Tue 06:00 See note from Janice below.

Error log and follow up comments:

I’m including the DBAs on this email, as the many Appworx failures suggest we have a problem with the databases (all appropriate instances are in restricted mode).

The error in FAIDCFAT_FA.GLBDATA-LOOP_01 was:

+ print Failure in spawned GLBDATA - abort this module

Failure in spawned GLBDATA - abort this module

+ exit 1

+ err=1

Consequently, to determine the source of the problem, the output log from the spawned GLBDATA must be viewed to determine what caused the failure in FAIDCFAT_FA.GLBDATA_01:

ERROR:

ORA-12526: TNS:listener: all appropriate instances are in restricted mode

Oh.. wait, it was giving us LAUNCH ERRORS so I thought it might be AWPROD… but it was still related to BANPROD because the module attempting to launch uses Appworx subvars which query BANPROD to obtain the value for the subvar.

I’ll try it again after I update the so_job_queue table because it has been retried too many times and the Operator Log for that job has filled up. As a reminder, IT Scheduling should further investigate the problem and/or solicit help if a LAUNCH ERROR status chain component repeatedly goes into LAUNCH ERROR status upon retry. An easy way to investigate is to view the Operator log for the component – in this case, it revealed the following error:

2010-09-07 07:13:06 status action QUEUED by JWEARNE

RmiServer 09-07-2010 07:13:07 MDT

Job launch error: 5013708.06 agent: AWPROD host: kebler.is.colostate.edu

ORA-12526: TNS:listener: all appropriate instances are in restricted mode

In this case, an ORA message was displayed each time the component was resubmitted.

Janice.
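
When a component keeps going back into LAUNCH ERROR, a quick summary of the ORA messages in the Operator log shows whether every retry is hitting the same database error. A minimal sketch, assuming the log text has been copied out to a string; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches codes such as ORA-12526 followed by the message text.
ORA_PATTERN = re.compile(r"\b(ORA-\d{5}):\s*(.*)")


def summarize_ora_errors(log_text):
    """Count each distinct ORA-nnnnn code and keep its first message text."""
    counts = Counter()
    messages = {}
    for line in log_text.splitlines():
        match = ORA_PATTERN.search(line)
        if match:
            code, message = match.group(1), match.group(2).strip()
            counts[code] += 1
            messages.setdefault(code, message)
    return counts, messages
```

If the same code comes back on every resubmit, as it did here, further retries are unlikely to help and the problem should be escalated.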

Aborted Module Name: AGENDYGN.AGENS006_01

Date: Day: Time: Resolution:

07/10/09 Fri 20:00 See Janice’s comments below.

Error log and follow up comments:

ERROR at line 1:

ORA-29282: invalid file ID

ORA-06512: at "SYS.UTL_FILE", line 802

ORA-06512: at line 671

ORA-06512: at line 1511

ORA-29280: invalid directory path

I noticed the AGENDYGN.AGENS006_01 failure and tried a resubmit, but it failed again. Weird situation -- looks like it doesn't like the /orautl/BANPROD directory.. Oh wait.. sql changed today although last modlog entry is 6/12/09 and the utl path is hard-coded in the sql as /orautl/BANTEST while the line to use &&utl_path has been commented out??? I could fix that, but maybe the version of sql in prod isn't what should be there -- i.e. why was it changed today and no recent modlog entry is present? Why is the version in prod using /orautl/BANTEST hardcoded logic? Sounds like a test version of the sql got placed into production, so I'll let others followup in the A.M.

Janice.
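
A check like the one below could catch a test version of a script before it reaches production by flagging /orautl paths that name a different instance. This is a sketch under assumptions: the function name is hypothetical, and it treats only lines starting with "--" as comments, which may not cover every way a line can be commented out in the real sql.

```python
import re


def hardcoded_utl_paths(sql_text, expected_instance):
    """Return (line_number, path) for /orautl/<instance> literals naming another instance.

    Lines that start with '--' are treated as comments and skipped.
    """
    findings = []
    for number, line in enumerate(sql_text.splitlines(), start=1):
        if line.lstrip().startswith("--"):
            continue
        for match in re.finditer(r"/orautl/(\w+)", line):
            if match.group(1) != expected_instance:
                findings.append((number, match.group(0)))
    return findings
```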

Aborted Module Name: AROSDGL1_AROSS167_01

Date: Day: Time: Resolution:

12/16/09 Wed 09:33 Restarted by David.

Error log and follow up comments:

ERROR at line 1:

ORA-20100: ::ID does not exist.::

ORA-06512: at "BANINST1.GB_COMMON", line 451

ORA-06512: at line 367

09:32:03 367 fetch trx_cur into trx_rec;

I am assuming that something changed for a person between the load of GURFEED and the run of AROSS167.

I am looking into the bad record now. I will respond with a solution shortly. Josh

There are 80 total transactions with an invalid GURFEED_ID. Below is the breakdown.

Row GURFEED_ID Count

1 null 62

2 824109854 18

I will follow up with AR to see why these do not have valid ID’s. Josh

The Null values are not causing any problems.

The 824109854 has been modified. Vicki is working to determine what was going on with that ID. Once a decision has been made on what to do with it we can restart the module.

Josh.

Module was restarted and is now complete.

Jan.

Aborted Module Name: HRMSQPD.CHAIN_SUMMARY_01

Date: Day: Time: Resolution:

07/07/09 Tue 08:40 No follow up received.

Error log and follow up comments:

Error:-999 -ERR 171 Database error on 'AW_REQUEST' - ORA-20006: User "APPWORX" is not active - contact I.T. Scheduling at 491-1375

ORA-06512: at "APPWORX.AW5", line 110

ORA-06512: at "APPWORX.AW5", line 54

ORA-06512: at line 1

- begin if lower(:so_user_name

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

The Appworx userid was deactivated by a failed mkbanner. Mark will research why this happened – might be an issue with the banprod_general link. The mkbanner works fine on AWTEST, using the bantest_general link. I reset the appworx userid within Appworx to “active” and restarted the following jobs which had failed with this error:

Janice.

Aborted Module Name: HRMSDAY1.HRMSS009_01

Date: Day: Time: Resolution:

12/16/09 Wed 06:32 See note from Janice below.

Error log and follow up comments:

Both HRMSS007 and HRMSS009 failed with:

ERROR at line 1:

ORA-12541: TNS:no listener

which I believe was caused by problems with ODS or this link to ODS. Both of these sqls use the csug_gp_demo_v view, which selects data from csuban.csug_gp_demo@odsprod.world.

When I attempt via an sql (on hrprod) to just select count using this link to odsprod, the same TNS no listener error is produced:

07:20:07 SQL> select count(*) from csuban.csug_gp_demo@odsprod.world

07:22:49 2 /

select count(*) from csuban.csug_gp_demo@odsprod.world

ERROR at line 1:

ORA-12541: TNS:no listener

Janice.

Aborted Module Name: KFSXAPIM.KFSX_JAVA_01

Date: Day: Time: Resolution:

09/24/09 Fri 07:04 See response from Janice.

Error log and follow up comments:

The error is :

ERROR: Exception caught:

Caused by: org.springframework.beans.factory.CannotLoadBeanClassException: Error loading class [edu.csu.kfs.fp.batch.service.impl.ProcurementCardCreateDocumentServiceImpl] for bean with name 'procurementCardCreateDocumentService' defined in class path resource [edu/csu/kfs/fp/spring-fp.xml]: problem with class file or dependent class; nested exception is java.lang.NoClassDefFoundError: groovy.lang.GroovyObject

Caused by: java.lang.NoClassDefFoundError: groovy.lang.GroovyObject

Same error occurred in KFSXAPIM.KFSX_JAVA_01.

Last week we encountered this error in kfsdevl and it required a change to the kfsdevl_Applications Manager.env file. I’m wondering if the build last night to kfsprd now requires the same change to kfsprd.env file? Email traffic related to kfsdevl.env shown below:

Shawn,

Can you add $LIB_KFS/groovy-all-1.6.4.jar:\ to kfsdevl_Applications Manager.env

Kevin.

Can a DBA please follow-up on this?

By the way, encumbrances did NOT feed to KFS last night due to this problem – that’s one of the jobs that failed. Since HR has requested that no encumbrances run starting tonight through Sept 30, it is critical that we successfully post last night’s encumbrances to KFS by finishing out this job. However, the design of the job is to shutdown Tomcat – run scrubber/poster – then start Tomcat back up. This will impact KFS users!!

Since the post of encumbrances did not happen last night, the WHRS_CUR_FY_JOBACCT_MTH_00 table refresh was skipped. However, since we need to post encumbrances to KFS and it’s running now -- I requested the WHRSL023 chain back into backlog so we can force a refresh of the WHRS_CUR_FY_JOBACCT_MTH_00 table. Otherwise, we would be out of sync – plus WHRS_CUR_FY_JOBACCT_MTH_00 table won’t be refreshed for the rest of the month due to HR suspending HRMS encumbrance processing through Sept 30.

Janice.

Aborted Module Name: KFSXCS20.HRMSS174_01

Date: Day: Time: Resolution:

07/21/09 Tue 21:11 See Janice’s note below.

Error log and follow up comments:

ERROR at line 1:

ORA-01410: invalid ROWID

ORA-06512: at line 228

21:11:35 228 for C1 in get_encumbrance_amounts (l_start_date, l_end_date) loop

This will be a good test of the feature for the later GL update chain (KFSXGL_D2) to proceed without encumbrances - it will be released by KFSXAW11 at 1 A.M. -- although I may just release it now since there isn't anything to wait for (KFSXCS20 already failed).

Janice.

Aborted Module Name: FAIDTMIM.TDCLIENT_01

Date: Day: Time: Resolution:

04/30/10 Fri 08:33 Restarted by Janice.

Error log and follow up comments:

I could not find lines 54 or 110.

Error:-999 -ERR 171 Database error on 'AW_REQUEST' - ORA-20006: User "APPWORX" is not active - contact I.T. Scheduling at 491-1375

ORA-06512: at "APPWORX.AW5", line 110

ORA-06512: at "APPWORX.AW5", line 54

ORA-06512: at line 1

- begin if lower(:so_user_name

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

Janice.

Aborted Chain Name: AROSDGLI.SEND_MAIL_01

Date: Day: Time: Resolution:

08/13/09 Thu 19:12 See note from Janice below.

Error log and follow up comments:

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cp: /orautl/BANPROD/AROSDGLI.AROSS162_01.utl_file1: A file or directory in the path name does not exist.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

Josh will manually create this missing file (and associated .recon file) and we will manually copy these files to the appropriate filenames in /ais01/dat/aros/prod directory so tonight’s KFSX Enterprise Feed will pick them up.

I’ve restarted the failed component to allow the AROS schedule to proceed.

Janice.

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

08/13/09 Thu 07:04 See Janice & John Hunter’s note below.

Error log and follow up comments:

KFSXFPPD Failure - Error:

2009-08-13 07:08:25,393 [main] INFO org.kuali.rice.kew.docsearch.SearchableAttribute :: ...finished indexing document 359577 for document search.

2009-08-13 07:08:26,334 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPD.disbursementVoucherPreDisbursementProcessorExtractStep.3178509.3178550.00 steps: [disbursementVoucherPreDisbursementProcessorExtractStep]

2009-08-13 07:08:26,335 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.IllegalArgumentException: Unable to find customer profile for M/CSU/DV

RunBatch ERROR: Exception found:

java.lang.IllegalArgumentException: Unable to find customer profile for M/CSU/DV

Just as with the problem yesterday with the chain component failure within KFSXPDSA, this KFSXFPPD failure has halted the daily KFSXPD_DY_PDP_DAILY_CHECK_ACH Applications Manager chain. The PDP Daily Check/ACH Processing for 12-AUG-2009 Summary email will not be sent, nor will any of today’s ach or check processing be performed until the problem with KFSXFPPD is resolved.

Janice.

For the Library feed, we had M in this table, I’ve changed it to MC.

John Hunter.

Aborted Module Name: KFSXCS52.KFSXS007_01

Date: Day: Time: Resolution:

12/16/09 Wed 06:32 See note from Janice below.

Error log and follow up comments:

One more production job which failed trying to connect to odsprod:

ERROR at line 1:

ORA-12541: TNS:no listener

ORA-06512: at line 250

06:32:30 250 select rtrim(FIRST_NAME), rtrim(LAST_NAME) , rtrim(MIDDLE_NAME), rtrim(SUFFIX_NAME)

06:32:30 251 ,WORK_PHONE, rtrim(EMAIL), DEPARTMENT_NUMBER, rtrim(ENAME), t1.csu_id

06:32:30 252 ,HR_EMPLOYEE_TYPE, WEID_EMPLOYEE_TYPE

06:32:30 253 INTO ODS_FIRST_NAME, ODS_LAST_NAME, ODS_MIDDLE_NAME, ODS_SUFFIX_NAME, ODS_WORK_PHONE

06:32:30 254 ,ODS_EMAIL, ODS_DEPARTMENT, ODS_ENAME, ODS_CSU_ID

06:32:30 255 ,ODS_HR_EMPLOYEE_TYPE, ODS_WEID_EMPLOYEE_TYPE

06:32:30 256 from csuf_employee_primary t1

06:32:30 257 where t1.employee_number = X.KFS_PRNCPL_ID

06:32:30 258 and rownum = 1;

Csuf_employee_primary is a view on kfsprd, selecting data from csuf_employee_primary@kfs_to_ods.

Janice.

Aborted Module Name: ODSRAGEN.ODSRS001_02

Date: Day: Time: Resolution:

08/24/09 Mon 09:20 See Jan & Janice’s note below.

Error log and follow up comments:

+ /Applications Manager/exec/FILESIZE ODSRAGEN.ODSRS001_02.3221972.3221976.00.2009_08_24_0313.jobout 100

no output from ODSRAGEN.ODSRS001_02

+ err=100

+ date

+ echo exiting SQLP_CSU Mon Aug 24 03:13:53 MDT 2009

exiting SQLP_CSU Mon Aug 24 03:13:53 MDT 2009

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

ODSRAGEN.ODSRS001_02 is complete.

Jan.

Due to the problem with the Oracle recycle process Sunday night, all of the ODS Refresh Applications Manager Chains had components which failed with:

Enter user-name: ERROR:

ORA-12541: TNS:no listener

Janice.

Aborted Module Name: ODSRAGEN.ODSRS002_01

Date: Day: Time: Resolution:

08/24/09 Mon 09:20 See Janice’s note below.

Error log and follow up comments:

ERROR at line 1:

ORA-12008: error in materialized view refresh path

ORA-01555: snapshot too old: rollback segment number with name "" too small

ORA-02063: preceding line from HR_ODS_USER

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2251

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2457

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2426

ORA-06512: at line 48

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

07:19:00 48 dbms_mview.refresh('CSUBAN.SPRIDEN_EMPN_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

Due to the problem with the Oracle recycle process Sunday night, all of the ODS Refresh Applications Manager Chains had components which failed with:

Enter user-name: ERROR:

ORA-12541: TNS:no listener

Janice.

Aborted Module Name: ODSRAGEN.ODSRS002_01

Date: Day: Time: Resolution:

09/08/09 Tue 01:41 See note from Janice below.

01/22/14 Wed 02:04 Deleted by Dermot.

Error log and follow up comments:

09/08/09.

ORA-12008: error in materialized view refresh path

ORA-04052: error occurred when looking up remote object

ODS_USER.CSUH_AGEN_CSUID_V@HR_ODS_USER

ORA-00604: error occurred at recursive SQL level 2

ORA-12541: TNS:no listener

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2251

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2457

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2426

ORA-06512: at line 48

By the way, if AGEN refreshes are going against the HRMS database, there should be a discussion about whether dependencies need to be added to ensure that HRMS database updates have completed before the AGEN refreshes occur. All such cross system ODS refreshes should be identified and appropriate analysis regarding dependency connections should be performed.

It may even be desirable to separate out such cross system ODS refreshes into separate Applications Manager refresh chain(s) – for example if the AGEN refresh which utilizes HR were in a separate chain, then a failure due to HRMS being down (such as occurred last night) would not cause all the rest of the AGEN refreshes to be halted. Likewise, if the AGEN refresh which utilizes HR were placed into the HRMS refresh chain, then BANPROD being down could potentially cause all the rest of the HRMS refreshes to be halted. However, if such cross system refreshes were isolated into separate ODS refresh chain(s), then the impact of database(s) being down would be isolated only to the refreshes for the “down” database and those isolated cross system refreshes using that “down” database.

Janice.
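
The isolation argument above can be made concrete with a small model. It assumes that a chain halts at the first refresh that needs the down database, so that refresh and everything after it in the chain do not run. The chain and refresh names below are hypothetical and do not come from the real ODS chains.

```python
def halted_refreshes(chains, down_db):
    """Map each affected chain to the refreshes that will not run.

    chains maps chain name -> ordered list of (refresh name, set of databases it reads).
    """
    halted = {}
    for chain, refreshes in chains.items():
        for index, (_, databases) in enumerate(refreshes):
            if down_db in databases:
                halted[chain] = [name for name, _ in refreshes[index:]]
                break
    return halted


# Cross-system refresh inside the AGEN chain (hypothetical names).
combined = {
    "AGEN": [("AGEN_A", {"BANPROD"}), ("AGEN_HR", {"BANPROD", "HRMS"}), ("AGEN_B", {"BANPROD"})],
}
# Same refreshes, with the cross-system one isolated in its own chain.
isolated = {
    "AGEN": [("AGEN_A", {"BANPROD"}), ("AGEN_B", {"BANPROD"})],
    "AGEN_CROSS": [("AGEN_HR", {"BANPROD", "HRMS"})],
}
print(halted_refreshes(combined, "HRMS"))
print(halted_refreshes(isolated, "HRMS"))
```

With HRMS down, the combined layout loses AGEN_HR and AGEN_B, while the isolated layout loses only the cross-system refresh.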

01/22/14.

ORA-12008: error in materialized view refresh path

ORA-01555: snapshot too old: rollback segment number 16 with name "_SYSSMU16_1294186362$" too small

ORA-02063: preceding line from BANPROD@ODS_USER

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2563

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2776

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2745

ORA-06512: at line 87

I'm not sure where to go with ORA-12008: error in materialized view refresh path

ORA-01555: snapshot too old: rollback segment number 16 with name "_SYSSMU16_1294186362$" too small

Should we just restart ODSRS002? Or do we need some DBA magic?

Vicki.

In the past we have found that restarting puts us back at the beginning of the process, which is unnecessary and can set the schedule back HOURS...

Steve. G.

The ODSRAGEN.ODSRS002_01 was on the last MV refresh -- I will run it and let you know when it is done.

The CSUBAN.CSUS_TEST_SCORES_MV failed again with the same error.

I think with Banner database being so busy with registration it just won't complete.

Do you want to abort it for today and try again tomorrow?

Let me take a look at the undo advisor and see if I can make some changes. This is caused because there are many updates going on at the same time something that runs a long time is querying the updated tables. I'd advise against running this during the day if it fails overnight.

Mark. P.

This might be relevant - Admissions is loading test scores into Banner manually (via the API) every morning, for the most part. I confirmed that they loaded 654 rows into sortest this morning.

Kathy B.

Aborted Module Name: HRMSCPR_SAL_HRMSS063_01

Date: Day: Time: Resolution:

09/22/09 Tue 12:29 See note below.

Unable to open the error module. To view the log, go to the master module, which should be in INITIATED status.

Double-click it to get the screen from which you can view the output file to find the error.

Error log and follow up comments:

I could not open it to locate the error. Dawn.

The error can be found in the concurrent manager out file (o3877725.out), which can be viewed via Explorer window, Output Files tab for this failed chain component.

Error message from out file:

ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

ORA-06512: at line 1062. Janice.

What is a concurrent manager? Dawn.

The HRMS programs which run via Applications Manager OAE (Oracle Apps Extension) are HRMS Concurrent Manager programs – i.e. programs defined to the HR Application Concurrent Manager Feature. While these concurrent manager programs can be scheduled within the HRMS application itself, we have chosen to instead schedule them via Applications Manager using the Applications Manager add-on OAE product. Basically, when one of these OAE type programs is a chain component, then Applications Manager will interface to the HRMS Concurrent Manager to submit the concurrent manager program to run. Likewise, upon completion (successful or not) of the HRMS Concurrent Manager program, the Applications Manager OAE interface retrieves output listings from the HR Application and presents them for viewing via the Applications Manager Explorer Window (Output Files Tab). This is similar to the interface with Banner programs in that the Banner log and lis files are available for viewing via Applications Manager Explorer Window (Output Files Tab).

Janice.

Aborted Chain Name: KFSXFPPC.SEND_MAIL_01

Date: Day: Time: Resolution:

06/15/11 Wed 15:05 Restarted by Steve Greene.

Error log and follow up comments:

You may have seen this abort (KFSXFPPC.SEND_MAIL_01) a while ago. I noticed that the error was the same as one from a week ago on another chain:

SMTP Failed to connect to mail server: A system call received a parameter that is not valid.

at /appworx/csu/exec/SENDMAIL.PL line 792

error is 255

I still had an email from Janice about the previous abort, so I asked her if this was the same scenario. She said yes, but the conditions were simpler, and there was no need to redo or delete the conditions. She said the module could just be restarted, which I did. It finished successfully.

Stevie G.

Aborted Chain Name: WHRSL022.SQLLOAD-LOOP_01

Date: Day: Time: Resolution:

09/23/09 Wed 06:14 See note from Janice & Diane.

11/29/12 Thu 06:14 See note from Gudrun & Mark.

Error log and follow up comments:

09/23/09.

Module is in LOADFAIL status:

The load for WHRS_CUR_FY_EXPHIST_00 on ODSPROD is failing due to a missing column in the table definition. The table definition will need to be fixed by a DBA before the failed production chain component can be restarted.

Janice.

SQL*Loader-466: Column SUBACCT does not exist in table "CSUHR"."WHRS_CUR_FY_EXPHIST_00".

Janice.

I think I got this incident late yesterday.

Diane.

11/29/12.

+ [ -f /app/oracle/product/11.2.0.3/bin/sqlldr ]

+ echo Could not find Sql Loader executable

Could not find Sql Loader executable

+ exit 2

ORACLE_HOME for odsprod points to /app/oracle/product/11.2.0.3/ in /etc/oratab on kebler. YET this Oracle Home DOES not have a sqlldr.

Please change in /etc/oratab the ORACLE_HOME to 11.1.0.7. It does have a sqlldr.

Otherwise if not possible we will have to conditionally set ORACLE_HOME, PATH and LIBPATH in our PREFIX.@odsprod script.

/etc/oratab

odsprod:/app/oracle/product/11.2.0.3:N

odstest:/app/oracle/product/11.2.0.3:N

odsdevl:/app/oracle/product/11.2.0.3:N

Once ORACLE_HOME issue is resolved APMX team can recover job.

Gudrun.
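
The check that failed here (is there a sqlldr under the ORACLE_HOME that /etc/oratab gives for the SID?) can be run ahead of time after any client change. A minimal sketch, assuming the usual sid:home:flag layout shown above; the function names are hypothetical.

```python
import os


def parse_oratab(text):
    """Map SID -> ORACLE_HOME from oratab-style 'sid:home:flag' lines."""
    homes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(":")
        if len(parts) >= 2:
            homes[parts[0]] = parts[1]
    return homes


def has_sqlldr(oracle_home):
    """True when <oracle_home>/bin/sqlldr exists as a file."""
    return os.path.isfile(os.path.join(oracle_home, "bin", "sqlldr"))
```

Looping over parse_oratab output and calling has_sqlldr for each home would list every SID whose loads would fail the same way.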

This is why I didn’t want to upgrade the client on Kebler. I will put it back the way it was.

I put the Oracle client back to 10.2.0.3 like it was prior to the install of the 11.2.0.3 client on Kebler. I also added the Oracle utilities to the 11.2.0.3 installation which is the actual correct solution to the problem. Would you like to use the old client or the new client for ODS?

Mark.

Aborted Module Name: KFSXPDSA.KFSX_JAVA_01

Date: Day: Time: Resolution:

09/23/09 Wed 08:14 See note from Janice & Kevin.

Error log and follow up comments:

KFSXPDSA.KFSX_JAVA_01 (pdpSendAchAdviceNotificationsStep) failed.

As a reminder, the daily PDP check cycle will not proceed until this failed chain component is either deleted or successfully completes.

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <www.medsourceinc.us.com>... User unknown

at org.springframework.mail.javamail.JavaMailSenderImpl.doSend(JavaMailSenderImpl.java:407)

at org.springframework.mail.javamail.JavaMailSenderImpl.send(JavaMailSenderImpl.java:298)

at org.springframework.mail.javamail.JavaMailSenderImpl.send(JavaMailSenderImpl.java:284)

at org.kuali.rice.kns.service.impl.MailServiceImpl.sendMessage(MailServiceImpl.java:60)

... 6 more

2009-09-23 08:13:10,878 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXPDSA.pdpSendAchAdviceNotificationsStep.3350895.3350924.00 steps: [pdpSendAchAdviceNotificationsStep]

2009-09-23 08:13:10,878 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

at org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl.sendAchAdviceEmail(PdpEmailServiceImpl.java:560)

at org.kuali.kfs.pdp.batch.service.impl.AchAdviceNotificationServiceImpl.sendAdviceNotifications(AchAdviceNotificationServiceImpl.java:56)

at org.kuali.kfs.pdp.batch.SendAchAdviceNotificationsStep.execute(SendAchAdviceNotificationsStep.java:38)

at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

Janice.

The offending email address has been set to BFS_AcctPay@mail.colostate.edu. The job can be rerun.

Kevin.
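
When this step fails, the rejected address appears on the SMTPAddressFailedException line of the joblog. The sketch below pulls those addresses out of one or more joblogs so they can be passed along with the abort notice. The function name rejected_addresses is hypothetical, and it assumes the log lines have the same shape as the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct address that was rejected
# with an SMTP 550 reply in the given joblog file(s).
rejected_addresses() {
    sed -n '/SMTPAddressFailedException: 550/s/.*<\([^>]*\)>.*/\1/p' "$@" |
        sort -u
}

# Possible use (the file name is a placeholder):
# rejected_addresses /ais01/joblog/SOME_JOBLOG_FILE
```

Each run appears to stop at the first rejected address, so a single joblog will normally yield one address; running it over several joblogs shows which addresses have come up so far.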

Aborted Module Name: KFSXGLEF_D2.KFSX_JAVA_01

Date: Day: Time: Resolution:

09/24/09 Thu 22:01 See note below from Janice & Kevin.

Error log and follow up comments:

ERROR: Exception caught:

Caused by: java.lang.NoClassDefFoundError: groovy.lang.GroovyObject

Same error occurred in KFSXAPIM.KFSX_JAVA_01.

Shawn,

Can you add $LIB_KFS/groovy-all-1.6.4.jar:\ to kfsdevl_Applications Manager.env?

Kevin.

Can a DBA please follow-up on this?

Janice.

Aborted Module Name: HRMSCHK_QPH.CHECK_WRITER_02

Date: Day: Time: Resolution:

09/25/09 Fri 10:30 See note from Janice.

Error log and follow up comments:

The following module is in DB ERROR status:

HRMSCHK_QPH.CHECK_WRITER_02

Occasionally, we encounter a problem where Applications Manager thinks that there is not enough room in the SO_LOG column of the SO_JOB_QUEUE table to store the information Applications Manager is logging related to condition actions. Usually the RmiServer log provides a hint that this is the problem with an error like this:

ORA-12899: value too large for column – and references the SO_LOG column.

This time the error message said the problem was with the SO_REQUEST_DATE column which doesn’t really make any sense at all. At any rate, the result is still the same – the module goes into DBERROR status while trying to perform the BEFORE conditions associated with the module. In the particular case of the CHECK_WRITER component on which today’s DBERROR occurred, it is even more confusing because the previous execution of this same module with exactly the same conditions just moments before finished just fine!

To fix this problem, replace the contents of the SO_LOG column in the SO_JOB_QUEUE table for this specific jobid with something small (or just a null value).

For example:

# export ORACLE_SID=awprod

# export TWO_TASK=awprod

# sqlplus Applications Manager

SQL*Plus: Release 10.2.0.3.0 - Production on Fri Sep 25 10:40:35 2009

Enter password:

Connected to:

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

10:40:39 SQL> update so_job_queue

10:40:51 2 set so_log = 'DB ERROR(QUEUED) 2009-09-25 10:15:26'

10:43:09 3 where so_jobid= '3361819'

10:43:30 4 /

1 row updated.

10:43:36 SQL> commit

10:43:43 2 /

Commit complete.
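
The statements in the session above can also be generated for a given jobid, so they can be reviewed before anyone pastes them into SQL*Plus. This is only a sketch: the function name so_log_fix_sql is hypothetical, it prints the SQL and does not connect to the database, and the table and column names are the ones used in the session above.

```shell
#!/bin/sh
# Hypothetical helper: print the update/commit statements for one jobid.
# Nothing is executed against the database.
so_log_fix_sql() {
    jobid=$1
    note=$2
    # Accept only digits and dots in the jobid (for example 3361819 or 3526201.01).
    case $jobid in
        ''|*[!0-9.]*) echo "invalid jobid: $jobid" >&2; return 1 ;;
    esac
    # A single quote in the note would break the SQL string literal.
    case $note in
        *\'*) echo "note must not contain a single quote" >&2; return 1 ;;
    esac
    cat <<EOF
update so_job_queue
   set so_log = '$note'
 where so_jobid = '$jobid'
/
commit
/
EOF
}

# Possible use:
# so_log_fix_sql 3361819 'DB ERROR(QUEUED) 2009-09-25 10:15:26'
```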

If you don’t have the Applications Manager password, Greg or Rich could perform this sql update.

Then the job can be restarted. Of course, each situation is different and the associated CONDITIONS would need to be carefully examined to determine how/if it can be restarted and whether any conditions have already been tagged as DONE that would need to be reactivated, etc.

Let me know if you have questions – I don’t want to be the only one who knows how to fix this!!

Another method to help identify this problem is to view the Operator Log (via Applications Manager Explorer, Operator Log Tab) for the chain component with the DBERROR. If the log display is very long and appears to possibly be truncated at the end – then the “value too large for column” may have occurred for the SO_LOG column. By the way, I would report this problem to UC4 but they would probably tell us to upgrade :-) and/or not be able to reproduce the problem. It doesn’t happen very often – but in the past, it’s basically been impossible to get the component restarted without manually updating the SO_LOG column to allow room for Applications Manager to log activity associated with that chain component.

Janice.

Aborted Module Name: ADMSPROS.SRTLOAD_01

Date: Day: Time: Resolution:

09/29/09 Tue 10:30 See note from Janice below.

Error log and follow up comments:

this_report_title=.Electronic_Prospect_Load

+ eprint_report=ADMSPROS.SRTLOAD_01.Electronic_Prospect_Load

+ [[ -s /Applications Manager/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis ]]

+ [[ ! -s /ais01/ftp/to/eprint/ADMSPROS_tempdir/ADMSPROS.SRTLOAD_01.lis ]]

+ print -n -- \n\fREPORT : ADMSPROS.SRTLOAD_01.Electronic_Prospect_Load\n\f

+ 1>> /ais01/ftp/to/eprint/ADMSPROS_tempdir/ADMSPROS.SRTLOAD_01.lis

+ cat /Applications Manager/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis

+ 1>> /ais01/ftp/to/eprint/ADMSPROS_tempdir/ADMSPROS.SRTLOAD_01.lis

+ [ 139 -eq 0 ]

For future reference, it would be helpful if IT Scheduling can include the type of feedback which I’ve sent this morning for the other SRTLOAD aborts and include the functional analyst as an email recipient. Note the correct excerpts from the various logs shown below for this latest ADMSPROS.SRTLOAD abort.

From the Banner log (ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.log):

file name /ais01/dat/work/prod/d3370022

Missing DOB field for record with name of

Rozana Beluts and SSN of

Missing DOB field for record with name of

Grant Hinkle and SSN of

Missing DOB field for record with name of

Kevin Nguyen and SSN of

Missing DOB field for record with name of

Ashley Sharpe and SSN of

Missing DOB field for record with name of

Bobby Torandaz and SSN of

Missing DOB field for record with name of

Michael Venter and SSN of

Missing DOB field for record with name of

Hamel Winter and SSN of

srtload completed successfully

609 lines written to /Applications Manager/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis

TOTAL EXECUTION TIME IN SECONDS: 22

TOTAL EXECUTION TIME IN MINUTES: 0.367

From the Banner lis (ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis)

Number of Records Read from Tape: 216

Total of Prospects Loaded : 216

Total of PIDMs Matched : 64

Total of Conversion Errors : 7

From the joblog (ADMSPROS.SRTLOAD_01.3370022.3370029.00.2009_09_29_1029.AWPROD.LOG):

+ /app/sct/banprod/general/exe/srtload -f -o /Applications Manager/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis

+ cat /Applications Manager/run/temppar.4708342

+ 1>> /Applications Manager/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.log 2>& 1

/Applications Manager/exec/AW_BANNER[414]: 770394 Memory fault(coredump)

+ err=139

+ rm -f /Applications Manager/run/temppar.4708342

+ 1> /dev/null 2>& 1

+ [ 139 != 0 ]

+ echo Non-zero error generated from running job. The program srtload failed to run successfully.

Non-zero error generated from running job. The program srtload failed to run successfully.

Janice.
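
A note on reading err=139 in the joblog above: shells commonly report a process killed by a signal as 128 plus the signal number, so 139 corresponds to signal 11 (SIGSEGV), which matches the "Memory fault(coredump)" message. A small sketch of that arithmetic follows; the function name exit_signal is hypothetical, and the 128 offset is the usual convention rather than something stated in this log.

```shell
#!/bin/sh
# Hypothetical helper: print the signal number implied by an exit status
# above 128; print nothing and return 1 for an ordinary exit code.
exit_signal() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
        return 0
    fi
    return 1
}

# Possible use:
# exit_signal 139    # prints 11
```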

Aborted Module Name: ADMSAPPL.ADMSS484_01

Date: Day: Time: Resolution:

12/13/10 Mon 22:12 Restarted by ITS.

Error log and follow up comments:

+ print *** \n*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212_sql_followup

+ egrep -v -f /ais01/dat/misc/prod/errstrg_sql_ORA_ok

+ egrep -f /ais01/dat/misc/prod/errstrg_sql /appworx/out/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212.AWPROD.LOG

+ 1>> /ais01/dat/work/prod/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212_sql_followup

+ print *** \n*** END SEARCH OF LOG FOR SQL ERROR STRINGS \n***

+ 1>> /ais01/dat/work/prod/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212_sql_followup

+ cat /ais01/dat/work/prod/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212_sql_followup

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 660

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

It is helpful to have some of the information before the actual error. In this case, I went back to the file in /ais01/joblog to get the output before the error. This tells me the person the sql was processing when the error occurred. So, this portion of the file:

Aidm 476631

sarrqst_cnt: 14

Before residency

Not all 3 Residency questions answered 'Y'

res_code: 0 res_claim: Y res_nonzip: res_zip: 80919 Application level is UG Major to be inserted ELEG-BMEE-BS declare

ERROR at line 1:

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 660

is more useful to us than just the error and the line number.

Bev.
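
One way to capture the lines leading up to an error, as requested in the note above, is sketched below. It prints every line that matches a pattern together with a chosen number of lines before it. The function name context_before is hypothetical; awk is used so the sketch does not depend on a grep that supports a "lines before" option.

```shell
#!/bin/sh
# Hypothetical helper: print each line of FILE that matches PATTERN,
# preceded by up to N earlier lines.
# Usage: context_before N PATTERN FILE
context_before() {
    awk -v n="$1" -v pat="$2" '
        $0 ~ pat {
            # Print the remembered lines that have not been printed yet.
            for (i = NR - n; i < NR; i++)
                if (i > last) print line[i]
            print
            last = NR
        }
        # Remember the current line and forget anything older than n lines.
        { line[NR] = $0; delete line[NR - n] }
    ' "$3"
}

# Possible use (the file name is a placeholder):
# context_before 6 '^ORA-' /ais01/joblog/SOME_JOBLOG_FILE
```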

I believe that the issue related to the data is solved. Please run the chain again.

Rami.

Aborted Module Name: KFSXPDSA.KFSX_JAVA_01

Date: Day: Time: Resolution:

09/30/09 Wed 08:36 See notes from Janice below.

Error log and follow up comments:

Kevin is already working on fixing the email address problem and will let us know when the failed component can be restarted.

Pertinent error messages from the joblog:

2009-09-30 08:10:03,509 [main] INFO org.kuali.kfs.sys.batch.Job :: Executing step: pdpSendAchAdviceNotificationsStep=class org.kuali.kfs.pdp.batch.SendAchAdviceNotificationsStep

2009-09-30 08:10:53,927 [main] ERROR org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl :: sendAchAdviceEmail() Invalid email address. Sending message to BFS_AcctPay@mail.colostate.edu

org.kuali.rice.kns.mail.InvalidAddressException: org.springframework.mail.MailSendException; nested exceptions (1) are:

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <anne.hanika@colostate.edu>... User unknown

... 6 more

2009-09-30 08:10:53,966 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXPDSA.pdpSendAchAdviceNotificationsStep.3378892.3378921.00 steps: [pdpSendAchAdviceNotificationsStep]

2009-09-30 08:10:53,966 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

at org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl.sendAchAdviceEmail(PdpEmailServiceImpl.java:560)

at org.kuali.kfs.pdp.batch.service.impl.AchAdviceNotificationServiceImpl.sendAdviceNotifications(AchAdviceNotificationServiceImpl.java:56)

at org.kuali.kfs.pdp.batch.SendAchAdviceNotificationsStep.execute(SendAchAdviceNotificationsStep.java:38)

at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

Janice.

This module aborted again. Dawn.

Another bad email address so it failed again:

2009-09-30 09:01:38,384 [main] ERROR org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl :: sendAchAdviceEmail() Invalid email address. Sending message to BFS_AcctPay@mail.colostate.edu

org.kuali.rice.kns.mail.InvalidAddressException: org.springframework.mail.MailSendException; nested exceptions (1) are:

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <nicole.brennan@colostate.edu>... User unknown

2009-09-30 09:01:38,429 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXPDSA.pdpSendAchAdviceNotificationsStep.3378892.3378921.01 steps: [pdpSendAchAdviceNotificationsStep]

2009-09-30 09:01:38,429 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

at org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl.sendAchAdviceEmail(PdpEmailServiceImpl.java:560)

at org.kuali.kfs.pdp.batch.service.impl.AchAdviceNotificationServiceImpl.sendAdviceNotifications(AchAdviceNotificationServiceImpl.java:56)

at org.kuali.kfs.pdp.batch.SendAchAdviceNotificationsStep.execute(SendAchAdviceNotificationsStep.java:38)

at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

Janice.

Another email showing the actual error message(s) for this failed component. IT Scheduling should include this type of error message in their APPLICATIONS MANAGER-ABORT email(s), if possible. Janice.

Just an F.Y.I.: I went into the output file and was doing a “Find” looking for the errors Janice is showing in her email below, and Applications Manager froze up on me before I got to the errors she mentions. This happened twice, and each time I had to close Applications Manager through Task Manager and log in again. I finally found the errors she mentions by looking at the output file in /ais01/joblog on kebler using spf. Steve.

That’s interesting – I was able to view those errors (and copy them) via the Applications Manager Output Files Viewer – ugh, just what we need – another unexplainable Applications Manager problem! Instead of doing a “Find”, you might try just scrolling down through the output file – that’s how I got to the error messages, rather than “Find”ing them. Janice.

Aborted Module Name: ADMSGREL.SRTLOAD_01

Date: Day: Time: Resolution:

10/01/09 Thu 09:35 See follow up from Bev below.

Error log and follow up comments:

Parameter 01 /ais01/dat/work/prod/d3385536 Read from Job Submission

Parameter 02 GRE Read from Job Submission

Parameter 05 G Read from Job Submission

Parameter 15 GRE Read from Job Submission

Parameter 22 U Read from Job Submission

PREL Code=GRE and TAPE=GRE Interface Code GRE Contact Code XTS Source A00005

MAJOR2INTEREST Function N

Valid Tape Code=GRE GRE Test Score Tape

Parameter 07 GR Read from Job Submission

Parameter 99 55 Read from Job Submission

Parameter 03 was not found in Job Submission

Parameter 04 was not found in Job Submission

Parameter 06 was not found in Job Submission

Parameter 08 M Read from Job Submission

Parameter 09 was not found in Job Submission

Parameter 10 was not found in Job Submission

Parameter 11 was not found in Job Submission

Parameter 12 XTS Read from Job Submission

Parameter 13 was not found in Job Submission

Parameter 14 MA Read from Job Submission

Parameter 16 A Read from Job Submission

Parameter 17 EADM Read from Job Submission

Parameter 18 N Read from Job Submission

Parameter 19 was not found in Job Submission

Parameter 20 N Read from Job Submission

Parameter 21 Y Read from Job Submission

Parameter 23 was not found in Job Submission

Parameter 24 was not found in Job Submission

file name /ais01/dat/work/prod/d3385536

*ERROR* INTERNAL FIELD TABLE SRTTPFD_ROW SIZED FOR 1000 FIELDS.

RESIZE STRUCT SRTTPFD_ROW TO ALLOW MORE ENTRIES.

srtload terminated with error

121 lines written to /Applications Manager/out/ADMSGREL.SRTLOAD_01.3385536.3385539.00.1683644.lis

I found the exact error on UDC as a Defect. I logged a critical Service Request with Sungard, because the solution is part of 8.3.

I received a call from Sungard. It has been escalated. We are the third school to encounter this problem.

For now, we have no resolution. ………Bev

This problem is unique to GRE loads. The other SRTLOADs appear to be fixed with the patch Mark applied.

Not running ADMSGREL until there’s a solution would be good……………Bev

Marcella and I tested this (SRTLOAD.pc) with two different loads and it now appears to be working correctly.

Bev.

Hi Robin, this issue was resolved on Monday. It is ok to run GRE jobs in Applications Manager now.

Marcella.

Aborted Module Name: HRMSFLX_SAL.HRMSS138_01

Date: Day: Time: Resolution:

10/23/09 Fri 11:45 See note from Janice below.

Error log and follow up comments:

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-01821: date format not recognized

ORA-06512: at line 287

11:40:54 287 Select substr(address_line1, 1, 30)

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

HRMSFLX_SAL.HRMSS138_01 is complete.

David.

Aborted Module Name: KFSXFPPC.KFSX_JAVA_03

Date: Day: Time: Resolution:

10/27/09 Tue 19:26 Restarted by Janice.

Error log and follow up comments:

Janice,

Debbie asked me to talk to you about this error. Three of us looked for the error and we did not find what you found. Do you have time to show me how you found this error?

I usually open the output listing and go to the bottom, then just start scrolling up until you see something like this:

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

Then keep scrolling up a bit and you’ll see the error and section of the output log which I emailed. I usually just do the scrolling, rather than searching for a word like “error”. In the past, it seems that Applications Manager hanging might be related to searching output files – maybe it’s just coincidence, but I can recall a few times when IT Scheduling was doing the find command in an output file and Applications Manager became non-responsive – so I usually just scroll through the listing instead.

2009-11-03 19:27:10,511 [main] ERROR org.kuali.rice.kew.util.XmlHelper :: Error accessing method 'getGroup' of instance of class org.kuali.rice.kew.actionitem.ActionItem

2009-11-03 19:27:10,624 [main] INFO org.kuali.rice.kew.engine.StandardWorkflowEngine :: Successfully processed document: 482781 : null

2009-11-03 19:27:17,732 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPC.procurementCardRouteDocumentsStep.3527835.3527843.00 steps: [procurementCardRouteDocumentsStep]

2009-11-03 19:27:17,732 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

RunBatch ERROR: Exception found:

org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

Caused by: javax.transaction.RollbackException

at org.objectweb.jotm.TransactionImpl.commit(TransactionImpl.java:245)

at org.objectweb.jotm.Current.commit(Current.java:488)

at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:842)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:651)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:621)

at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:311)

at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:117)

at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)

at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:203)

at $Proxy172.routeProcurementCardDocuments(Unknown Source)

at org.kuali.kfs.fp.batch.ProcurementCardRouteDocumentsStep.execute(ProcurementCardRouteDocumentsStep.java:42)

at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

Please run KFSXFPPC…………..Kevin.

I restarted the failed KFSXFPPC.KFSX_JAVA_03 chain component……….Janice.

Aborted Module Name: LAUNCH/DB ERRORS

Date: Day: Time: Resolution:

11/03/09 Tue 10:00-12:30 See note from Janice below.

Error log and follow up comments:

In the past, the sole purpose of the APWXCHCK_HOURLY_SYSTEM_CHECK chain was to "touch" /ais01/dat/apwx/prod/APWXCHCK_HOURLY.DAT, the Kebler file which the Enterprise Manager process monitors to determine if Applications Manager may have stalled and needs to be recycled. Although the majority of LAUNCH and DB ERRORS are connected to java hanging/Applications Manager problems, we occasionally encounter these errors for other reasons.

The /ais01/dat/apwx/prod/APWXCHCK_HOURLY.DAT file, coupled with the Enterprise Manager process, will not detect such errors and therefore a separate process was developed. While there may be some duplication of reporting when LAUNCH/DB ERRORS are related to java hanging/Applications Manager problems, the separate process to check backlog for such errors is the only way to detect and report on *ALL* LAUNCH/DB ERRORS.
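
To illustrate the heartbeat idea described above: the chain touches the file, and a monitor can treat the file as a sign of a stall once its modification time gets too old. The sketch below is not the Enterprise Manager check, whose actual logic is not described in this note. The function name heartbeat_is_stale and the 90-minute threshold in the usage line are assumptions, and it relies on a find that supports -mmin (GNU and BSD find do).

```shell
#!/bin/sh
# Hypothetical helper: return 0 if FILE is missing or was last modified
# more than MINUTES minutes ago, otherwise return 1.
heartbeat_is_stale() {
    file=$1
    minutes=$2
    [ -e "$file" ] || return 0
    [ -n "$(find "$file" -mmin +"$minutes")" ]
}

# Possible use (the threshold is an assumption for an hourly touch):
# if heartbeat_is_stale /ais01/dat/apwx/prod/APWXCHCK_HOURLY.DAT 90; then
#     echo "APWXCHCK_HOURLY.DAT has not been touched recently" >&2
# fi
```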

Below please find a sample of the email which will be sent from the APWXCHCK_HOURLY_SYSTEM_CHECK chain if any chain components are found in backlog in either LAUNCH ERROR or DB ERROR status. The process to check backlog is performed via the APWXCHK_BACKLOG module, which runs the underlying APWXCHK_BACKLOG report, which I created for this purpose via the Applications Manager Reports feature. Via an AFTER condition of the APWXCHK_BACKLOG module, the SEND_MAIL module will be requested to email the APWXCHK_BACKLOG report **if** the size of the output report file is indicative that detail records are present on the report. The recipients for this email are contained within the mailing list file, /ais01/dat/misc/mailst/SEND_MAIL.CRITFAIL.LST, which currently contains:

970-226-7550@PAGE.METROCALL.COM

Jan.Mueller@ColoState.EDU

Janice.Wilkinson@ColoState.EDU

David.Peterson@ColoState.EDU

IT_scheduling@mailer.is.colostate.edu

The same mailing list file, /ais01/dat/misc/mailst/SEND_MAIL.CRITFAIL.LST, is also utilized for the SEND_MAIL component which is spawned via the COMPLETION script for Critical Chain Component Aborts.
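
The "send only if the report has detail records" decision described above can be pictured with the sketch below. The real AFTER condition is based on the size of the output file and its threshold is not given here, so this version counts lines instead. The function name report_has_detail and the default of 5 header lines are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: return 0 when REPORT has more lines than the
# header alone would produce, otherwise return 1.
# Usage: report_has_detail REPORT [HEADER_LINES]
report_has_detail() {
    report=$1
    header_lines=${2:-5}
    [ -f "$report" ] || return 1
    # The arithmetic expansion strips any padding that wc adds.
    lines=$(($(wc -l < "$report")))
    [ "$lines" -gt "$header_lines" ]
}

# Possible use (the report path is a placeholder):
# if report_has_detail /path/to/APWXCHK_BACKLOG.report; then
#     echo "detail records present, request SEND_MAIL"
# fi
```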

-----Original Message-----

From: Applications Manager@Kebler.is.colostate.edu [mailto:Applications Manager@Kebler.is.colostate.edu]

Sent: Tuesday, November 03, 2009 11:16 AM

Cc: Wilkinson,Janice

Subject: LAUNCH/DB ERRORS

Tue Nov 03 11:15:35 MST 2009 Page 1

Check Backlog for Launch or DB Errors

Status       Name Module                  Jobid
------------ ---------------------------- ----------
DB ERROR     APWXTST0_EX.WAIT_FOR_COND_01 3526200
LAUNCH ERROR APWXTST0_EX.PAYRPNAC_01      3526201.01

P.S. Just a reminder that the Enterprise Manager process will continue to send the page to Greg's pager number when Applications Manager has been recycled.

Janice.

Aborted Module Name: KFSXFPPC.KFSX_JAVA_03

Date: Day: Time: Resolution:

11/04/09 Wed 19:27 See note from Janice below.

Error log and follow up comments:

Janice,

Debbie asked me to talk to you about this error. Three of us looked for the error and we did not find what you found. Do you have time to show me how you found this error?

I usually open the output listing and go to the bottom, then just start scrolling up until you see something like this:

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

2009-11-03 19:27:10,511 [main] ERROR org.kuali.rice.kew.util.XmlHelper :: Error accessing method 'getGroup' of instance of class org.kuali.rice.kew.actionitem.ActionItem

2009-11-03 19:27:10,624 [main] INFO org.kuali.rice.kew.engine.StandardWorkflowEngine :: Successfully processed document: 482781 : null

2009-11-03 19:27:17,732 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPC.procurementCardRouteDocumentsStep.3527835.3527843.00 steps: [procurementCardRouteDocumentsStep]

RunBatch ERROR: Exception found:

org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

Caused by: javax.transaction.RollbackException

at org.objectweb.jotm.TransactionImpl.commit(TransactionImpl.java:245)

at org.objectweb.jotm.Current.commit(Current.java:488)

at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:842)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:651)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:621)

at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:311)

at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:117)

at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)

at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:203)

at $Proxy172.routeProcurementCardDocuments(Unknown Source)

at org.kuali.kfs.fp.batch.ProcurementCardRouteDocumentsStep.execute(ProcurementCardRouteDocumentsStep.java:42)

at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

Please run KFSXFPPC…………..Kevin.

I restarted the failed KFSXFPPC.KFSX_JAVA_03 chain component……….Janice.

Aborted Module Name: HRMSDED_SAL.HRMSRPTS-LOOP_01

Date: Day: Time: Resolution:

11/19/09 Thu 13:12 See follow up notes below.

Error log and follow up comments:

+ . /Applications Manager/exec/COMPLETION

+ egrep ABORTED|CRITFAIL|C-Error

+ grep 3597011

3597011.00 BATCH HRMSDED_SAL.HRMSR00211/19 13:24 00:20:35 C-Error APPLICATIONS MANAGER HRMSDED_DEDUCTION_REPORTS

+ print Failure in spawned HRMSR002 - abort HRMSRPTS-LOOP

Failure in spawned HRMSR002 - abort HRMSRPTS-LOOP

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

Debra called me on this failure, so I talked with Steve Hill. Steve was unaware of this failure until he received a Clarity task yesterday PM to fix a bug in the report. The report was modified and is waiting on Chris Domanik in HR to validate the report has been corrected. Once corrected, he will generate a turnover and let ITS know that the job can be restarted. Janice or Jan may be able to weigh in on the restart of this process..……….Ken

As Ken indicated in his email, the proposed solution for the failed HRMSR002 is waiting for approval from Chris. There are a total of 20 entries in the salary report driver file, /ais01/dat/hrms/prod/HRMSRPTS_SALARY_DRIVER. Two of those entries are for HRMSR002 – one to produce a download file and another to produce a hard-copy report. While we wait for approval regarding the HRMSR002 solution, I’ve moved the two HRMSR002 entries to the end of the driver file so we can proceed with other reports that are to run and I’ve restarted the failed HRMSDED_SAL.HRMSRPTS-LOOP_01. When the HRMSRPTS-LOOP reaches the HRMSR002 entries in the driver file, the spawned HRMSR002 will fail again unless the solution has already been implemented into production. If HRMSR002 and/or any of the other spawned reports fails, please be sure that notification is sent to Steve Hill in addition to the apwx_maint mailing list.

Janice.
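
For anyone who needs to repeat the driver file change described above, the reordering can be sketched as follows. The format of HRMSRPTS_SALARY_DRIVER is not shown in this log, so the sketch assumes one entry per line with the report name appearing somewhere on that line. The function name move_entries_last is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with every line containing NAME moved
# to the end, keeping the original order within each group.
# Usage: move_entries_last NAME FILE
move_entries_last() {
    name=$1
    file=$2
    grep -v -F -- "$name" "$file"
    grep -F -- "$name" "$file"
    return 0
}

# Possible use: write to a new file, review it, then replace the driver.
# move_entries_last HRMSR002 HRMSRPTS_SALARY_DRIVER > HRMSRPTS_SALARY_DRIVER.new
```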

Aborted Module Name: KFSXOFF_D1.KFSX_OFFLINE_01

Date: Day: Time: Resolution:

11/19/09 Thu 19:10 Restarted by Janice.

Error log and follow up comments:

19 19:10:51-Child:Done

19 19:10:55-Parent: (4)Checking child process(413908)

19 19:10:55-Parent: Child process[413908] done.

19 19:10:55-Parent: Checking child mem

19 19:10:55-Parent: Value in mem [1]

19 19:10:55-Parent: Child process returned a value.

19 19:10:55-Parent: child process done.

19 19:10:55-Parent:Value in mem [1]

19 19:10:55-Deleting kill file if exists [/Applications Manager/run/jobpid.3599633.00]

19 19:10:55-Deleting flag file if exists [/Applications Manager/run/jobpid.3599633.00]

19 19:10:55-Getting env 'SURUNEXIT'

19 19:10:55-Null

Exiting with su job error code[1].

error is 1

Last night’s KFSX schedule stalled due to an abort in the KFSXOFF.KFSX_OFFLINE_01 component of the KFSXOFF_D1_ONLINE_OFF_JOB1 chain. The KFSX offline/online processes perform remote shell execution (on Malta) of the scripts to shut down and start up Tomcat, respectively. The logs (feedback) from the remote shell executions are written to the /ais02/log directory on an NFS-mounted volume to allow the controlling KFSX_OFFLINE and KFSX_ONLINE components, which run on Kebler, to examine the logs and determine if the remote shell processes were successful. Occasionally, with NFS-mounted volumes, we have experienced a delay in availability of a file back on Kebler, resulting in either an empty file or a partially complete file being examined rather than the complete file. Primarily, we’ve seen this type of delay with utl files from sql executions and until last night had not experienced this type of delay for remote shell execution logs. While the remote shell execution to shut down Tomcat actually was successful last night, an incomplete version of the associated log file was examined by the controlling KFSX_OFFLINE script on Kebler. Consequently, it appeared to the KFSX_OFFLINE script that the Tomcat shutdown was NOT successful, which caused the KFSXOFF.KFSX_OFFLINE_01 component to abort.

To prevent a recurrence of this situation, I have made the following changes:

· Added a sleep command in the KFSX_OFFLINE.KSH and KFSX_ONLINE.KSH scripts to introduce a delay between the rsh (remote shell execution) command and the subsequent process which attempts to examine the remote shell log file in the /ais02/log directory on the NFS mounted volume.

· The following KFSX offline/online chains are now critical chains so any failures within these chains will result in a page to IT Scheduling Oncall Staff, who will then contact the appropriate IS staff to follow-up:

KFSXOFF_D1_ONLINE_OFF_JOB1 KFSX Daily Online Off Job #1 (Shutdown Tomcat)

KFSXOFF_D2_ONLINE_OFF_JOB2 KFSX Daily Online Off Job #2 (Shutdown Tomcat)

KFSXON_D1_ONLINE_ON_JOB1 KFSX Daily Online On Job #1 (Startup Tomcat)

KFSXON_D2_ONLINE_ON_JOB2 KFSX Daily Online On Job #2 (Startup Tomcat)

Janice.
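
A fixed sleep narrows the window but cannot guarantee that the log is complete when it is read. As a minimal sketch of an alternative, assuming the remote script can write a final marker line to its log, the controlling script could poll for that marker with a timeout. The function name wait_for_log and the marker text RSH_LOG_COMPLETE are hypothetical and are not taken from KFSX_OFFLINE.KSH or KFSX_ONLINE.KSH.

```shell
#!/bin/sh
# Sketch only: wait_for_log and the end-of-log marker are hypothetical,
# not part of the production KFSX scripts.

# wait_for_log FILE MARKER TIMEOUT_SECONDS [POLL_SECONDS]
# Returns 0 once FILE contains MARKER, 1 if the timeout expires first.
wait_for_log() {
    log_file=$1
    marker=$2
    timeout=$3
    poll=${4:-1}
    waited=0
    while [ "$waited" -le "$timeout" ]; do
        # The file may not exist yet, or may be only partly visible, on this host.
        if [ -f "$log_file" ] && grep -qF -- "$marker" "$log_file"; then
            return 0
        fi
        sleep "$poll"
        waited=$((waited + poll))
    done
    return 1
}
```

A controlling script might call it right after the rsh command, for example `wait_for_log "$rsh_logname" RSH_LOG_COMPLETE 60 5 || exit 1`, and only then examine the log. A timeout is treated as a failure, so an incomplete log is never mistaken for a failed shutdown without first waiting for it.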

Aborted Module Name: KFSXFPPC_FP_PCARD_DOCUMENTS

Date: Day: Time: Resolution:

11/30/09 Mon 09:00 Restarted by Jan.

Error log and follow up comments:

org.kuali.rice.kew.actionitem.ActionItem

2009-11-24 19:18:17,384 [main] ERROR org.kuali.rice.kew.mail.service.impl.ActionListEmailServiceImpl :: Error sending Action List email.

org.kuali.rice.kew.exception.WorkflowRuntimeException: javax.mail.SendFailedException: Invalid Addresses;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Alissa.Gigliotti@colostate.edu>... User unknown

at org.kuali.rice.kew.mail.service.impl.DefaultEmailService.sendEmail(DefaultEmailService.java:73)

at sun.reflect.GeneratedMethodAccessor536.invoke(Unknown Source)

I updated the email to akgigliotti@hotmail.com

Could you rerun the job?

John Hunter.

Aborted Module Name: HRMSS041.HRMSS041_01

Date: Day: Time: Resolution:

11/30/09 Mon 20:16 Restarted by David.

Error log and follow up comments:

20:15:37 534

20:15:37 535 if l_run_date between c2.cvg_strt_dt and c2.cvg_thru_dt then

20:15:37 536 -- dbms_output.put_line(c1.person_id||','|| c1.prtt_enrt_rslt_id); --used for error testing

20:15:37 537 -- Member Level Detail segment

20:15:37 538 l_segment := csuh_edi_834_pkg.edi_ins(

20:15:37 539 p_ins01 => 'N'

20:15:37 540 ,p_ins02 => c2.contact_type_code

20:15:37 541 ,p_ins03 => '030'

20:15:37 542 ,p_ins04 => '20'

20:15:37 543 ,p_ins05 => 'A');

20:15:37 544 write_segment(ws_file_handle, l_segment,l_segment_count);

20:15:37 545

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 538

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

This failed because several records once again have “BEN” as the contact type when they should not.

Jan.

Chris,

Aren’t your people running that script daily and fixing up the records?

-Bob-

I’ve included HRMSS041 as an exception so the HRMSAW99 should finish in about 5 minutes when it wakes up again to check backlog. Once that has completed, IT Scheduling may release staging which will allow HRMSAW15 to be staged in. HRMSS041 has no successors, so if we need to let it hang until tomorrow, that shouldn’t cause any problems with tonight’s HRMS schedule.

Janice.

HRMSS041 is complete.

David.

Aborted Module Name: KFSXFPPC.KFSX_JAVA_01

Date: Day: Time: Resolution:

12/03/09 Thu 15:03 Restarted by ITS.

Error log and follow up comments:

2009-12-03 15:03:32,138 [main] INFO edu.csu.batch.service.RunBatch :: Executing job: KFSXFPPC.procurementCardLoadStep.3650673.3650674.00 steps: [procurementCardLoadStep]

2009-12-03 15:03:32,201 [main] INFO org.kuali.kfs.sys.batch.Job :: Started processing step: 0=procurementCardLoadStep

2009-12-03 15:03:32,209 [main] INFO org.kuali.kfs.sys.batch.Job :: Creating user session for step: procurementCardLoadStep=kfs

2009-12-03 15:03:32,395 [main] INFO org.kuali.kfs.sys.batch.Job :: Executing step: procurementCardLoadStep=class org.kuali.kfs.fp.batch.ProcurementCardLoadStep

2009-12-03 15:03:33,073 [main] ERROR org.kuali.kfs.sys.exception.XmlErrorHandler :: error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

2009-12-03 15:03:33,157 [main] ERROR org.kuali.kfs.fp.batch.service.impl.ProcurementCardLoadTransactionsServiceImpl :: Error parsing xml error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

2009-12-03 15:03:33,160 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPC.procurementCardLoadStep.3650673.3650674.00 steps: [procurementCardLoadStep]

2009-12-03 15:03:33,160 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.RuntimeException: Error parsing xml error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Error parsing xml error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

at org.kuali.kfs.fp.batch.service.impl.ProcurementCardLoadTransactionsServiceImpl.loadProcurementCardFile(ProcurementCardLoadTransactionsServiceImpl.java:75)

at org.kuali.kfs.fp.batch.ProcurementCardLoadStep.execute(ProcurementCardLoadStep.java:54)

at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

Caused by: org.kuali.kfs.sys.exception.XMLParseException: error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

at org.kuali.kfs.sys.exception.XmlErrorHandler.error(XmlErrorHandler.java:42)

at org.apache.xerces.util.ErrorHandlerWrapper.error(Unknown Source)

David.

The last pcard transaction in the input file was missing the Chart Of Accounts code of CO for account 1306230. I updated this, so can someone please re-run. I’ll contact John Swaro to update the default COA.

John Walker.

Aborted Module Name: EIDSUPDT.EIDSS002_01

Date: Day: Time: Resolution:

06/14/11 Tue 22:38 Restarted by Joleen.

Error log and follow up comments:

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error

ORA-06512: at line 23

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

I'm forwarding this to Phil since neither of the individuals on the EIDS Alert list are here today!!

Janice.

I couldn't figure out any obvious cause for the numeric value error. Please try to run this again, and if we still have the error, I will update the program to output some data.

Rami.

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

12/10/09 Thu 14:06 Restarted.

Error log and follow up comments:

2009-12-10 14:06:00,421 [main] INFO org.kuali.rice.kew.docsearch.SearchableAttribute :: ...finished indexing document 523858 for document search.

2009-12-10 14:06:00,616 [main] ERROR org.apache.ojb.broker.accesslayer.JdbcAccessImpl ::

* SQLException during execution of sql-statement:

* sql statement was 'INSERT INTO PDP_PMT_NTE_TXT_T (PMT_NTE_ID,CUST_NTE_LN_NBR,CUST_NTE_TXT,LST_UPDT_TS,VER_NBR,PMT_DTL_ID,OBJ_ID) VALUES (?,?,?,?,?,?,?) '

* Exception message is [ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

* Vendor error code [12899]

* SQL state code [72000]

* Target class is 'org.kuali.kfs.pdp.businessobject.PaymentNoteText'

* PK of the target object is [id=10071760]

* Source object: paymentNoteText(id)=(10071760)

* The root stack trace is -->

* java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:331)

at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:288)

at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:745)

at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:216)

at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:966)

Shawn,

A job failed due to a crazy character in the DV check stub field for 2 DV docs. Can you run the attached script in kfsprd to correct (remove) the crazy characters?

Please let David/Apwx_maint know when you are done, so that they can re-run the job.

John Walker.
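
The log reports a length of 91 against a maximum of 90 for CUST_NTE_TXT. One possible explanation, which the log does not confirm, is that the column limit is counted in bytes and the unexpected character occupies more than one byte. The sketch below, with hypothetical helper names byte_length and non_ascii_bytes, shows how a value could be checked before it is loaded.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical. Assumes the column limit is
# counted in bytes, which the log above does not confirm.

# byte_length STRING: print the number of bytes in STRING.
byte_length() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# non_ascii_bytes STRING: print how many bytes fall outside 7-bit ASCII.
non_ascii_bytes() {
    printf '%s' "$1" | LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}
```

In UTF-8, a value of 89 ASCII characters plus one two-byte character shows 90 characters but measures 91 bytes, so a non-zero result from non_ascii_bytes is a hint that the value deserves a second look.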

Aborted Module Name: HRMSDAY1.HRMSS007_01

Date: Day: Time: Resolution:

12/16/09 Wed 07:15 See note from Janice below.

09/15/10 Wed 20:20 Restarted by ITS.

Error log and follow up comments:

12/16/09.

Both HRMSS007 and HRMSS009 failed with:

ERROR at line 1:

ORA-12541: TNS:no listener

which I believe was caused by problems with ODS or this link to ODS. Both of these sqls use the csug_gp_demo_v view, which selects data from csuban.csug_gp_demo@odsprod.world.

When I attempt via an sql (on hrprod) to just select count using this link to odsprod, the same TNS no listener error is produced:

07:20:07 SQL> select count(*) from csuban.csug_gp_demo@odsprod.world

07:22:49 2 /

select count(*) from csuban.csug_gp_demo@odsprod.world

ERROR at line 1:

ORA-12541: TNS:no listener

09/15/10.

ERROR at line 23:

ORA-06550: line 23, column 17:

PL/SQL: ORA-00904: "CSUG_GP"."MULTI_RACE_IND": invalid identifier

ORA-06550: line 8, column 1:

PL/SQL: SQL Statement ignored

ORA-06550: line 184, column 37:

PLS-00364: loop index variable 'STUDENT_ETHNIC_REC' use is invalid

ORA-06550: line 184, column 4:

PL/SQL: Statement ignored

ORA-06550: line 185, column 37:

PLS-00364: loop index variable 'STUDENT_ETHNIC_REC' use is invalid

ORA-06550: line 185, column 4:

PL/SQL: Statement ignored

ORA-06550: line 186, column 37:

PLS-00364: loop index variable 'STUDENT_ETHNIC_REC' use is invalid

ORA-06550: line 186, column 4:

PL/SQL: Statement ignored

ORA-06550: line 192, column 37:

PLS-00364: loop index variable 'STUDENT_ETHNIC_REC' use is invalid

This process can be restarted. I restored the old script and put it in the /ais01/src/sql/temp directory. This file can be flagged to stay in this directory for 1 week. If I need to contact someone else concerning the file in the temp directory please let me know.

Steve H.

Aborted Module Name: KFSXGLPO_D1.KFSX_JAVA_02

Date: Day: Time: Resolution:

12/15/09 Tue 19:50 Restarted by Jan.

Error log and follow up comments:

java.lang.RuntimeException: PosterServiceImpl Stopped: AbstractUpdatingPreparedStatementCachingDaoJdbc.UpdatingJdbcWrapper encountered exception during getObject method for type: class org.kuali.kfs.gl.businessobject.Entry

at org.kuali.kfs.gl.batch.service.impl.PosterServiceImpl.postTransaction(PosterServiceImpl.java:460)

at org.enhydra.jdbc.core.CorePreparedStatement.executeUpdate(CorePreparedStatement.java:102)

at org.kuali.kfs.sys.batch.dataaccess.impl.AbstractPreparedStatementCachingDaoJdbc$JdbcWrapper.update(AbstractPreparedStatementCachingDaoJdbc.java:39)

... 87 more

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1

KFSX Follow-up Summary:

Since the KFSXGLPO_D1.KFSX_JAVA_02 failure was within the KFSX Online off (Tomcat down) window, Tomcat should have remained down until all processing scheduled to occur within the KFSX Online off window had completed this morning. Although I was not directly informed of any decision to bring Tomcat up (KFSX Online access back on), I discovered that it had been manually started this morning, thereby allowing users online access to KFS. This posed a dilemma: not only did the KFSXGL_D1_DAILY_GL_UPDT_JOB1 update chain (which had the KFSXGLPO_D1.KFSX_JAVA_02 failure) still need to complete, but so did several other chains, all of which SHOULD run with Tomcat down (i.e. no KFSX users on the system). Additionally, the 2nd GL update chain, KFSXGL_D2_DAILY_GL_UPDT_JOB2, which posts encumbrances, had not yet run. If we allowed the encumbrance posting process to run as designed, it would shut down Tomcat, run the scrubber and poster subchains, and then bring Tomcat back up.

Although it obviously would have been best to not bring Tomcat up this morning, we already had users back in the system, creating documents, etc. and we had to proceed by determining how to minimize the adverse impact. Kevin and I discussed which chains were left to run and decided that although not the ideal scenario, the adverse impact would be less by leaving Tomcat up and bypassing the encumbrance updates. To accomplish that, the following actions were taken:

1) Deleted KFSXCS20.KFSXS001_01 (so encumbrance feed for KFSXGL_D2 would not be created)

2) Deleted KFSXCS20.NOTIFY_FOR_APWX_01 (so notify for KFSXGL_D2 would not be created).

3) Placed a hold on KFSXGLEF_D2.KFSX_JAVA_01 so that output from the KFSXGLEF_D2.COLLECT_FILES_01 component could be checked to verify that no data was collected. Once that verification was performed, this chain component was released to run.

In the future, when there are production job aborts within the Tomcat down (KFSX Online Access Off) window, Tomcat will be down in the morning and should remain down until the problems are resolved. In today’s case, we were lucky that the update chain failed far enough into the process that having users back online was not as problematic. However, had the failure been earlier within the chain, that may not have been the case. It is extremely important that consultation regarding status of KFSX Applications Manager production schedule occur prior to making decisions regarding whether Tomcat should be manually started. Let me know if you have questions.

Janice.

We did have a document go into exception this morning due to a user updating a PO (entering receiving information?) at the same time the Auto Close PO job was updating the PO. John H was able to approve the document.

Kevin.

Aborted Module Name: APWXLOOK_AM.APWXLOOK_01

Date: Day: Time: Resolution:

06/04/10 Fri 08:18 Restarted by Jan.

Error log and follow up comments:

+ 1>> /ais01/dat/work/prod/APWXLOOK_AM.APWXLOOK_01_jobstat

+ cat /ais01/dat/work/prod/APWXLOOK_AM.APWXLOOK_01_jobstat

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

grep: 0652-033 Cannot open /ais01/src/sql/temp/APWXLOOK_OK.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

I created /ais01/src/sql/temp/APWXLOOK_OK and restarted APWXLOOK_AM.APWXLOOK_01, complete.

Jan

Aborted Module Name: FAIDEPLS_EV.LYNX_01

Date: Day: Time: Resolution:

07/10/13 Wed 15:01 Restarted by Joleen.

Error log and follow up comments:

[OracleException]: ORA-20100: *Error* in call to rp_award.p_update: ORA-06502: PL/SQL: numeric or value error: NULL index table key value

ORA-06512: at "BANINST1.RP_AWARD", line 1840

ORA-06512: at "BANINST1.RP_AWARD", line 1895

Karma asked me to bypass the aborted LYNX step and let the rest of the process flow run. FAIDEPLS_E-PLUS has finished running.

Joleen.

Aborted Module Name: KFSXFPPC.KFSX_JAVA_01

Date: Day: Time: Resolution:

12/28/09 Mon 15:12 Restarted by ITS.

Error log and follow up comments:

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Error parsing xml error Parsing error was encountered on line 3209, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

at org.kuali.kfs.fp.batch.service.impl.ProcurementCardLoadTransactionsServiceImpl.loadProcurementCardFile(ProcurementCardLoadTransactionsServiceImpl.java:75)

at org.kuali.kfs.fp.batch.ProcurementCardLoadStep.execute(ProcurementCardLoadStep.java:54)

at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

Caused by: org.kuali.kfs.sys.exception.XMLParseException: error Parsing error was encountered on line 3209, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

Missing a chart code in KFSXFPPC.KFSXS008_01.utl_file2, KFSXFPPC.FCS3571B_20091223_143413SB_000350.cdf.xml. I entered CO. Please restart this step/job.

Kevin.
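
The pcard load failures recorded above came from an empty element where the schema requires two characters. As a sketch, the input file could be scanned for empty elements before the load step runs. The function name empty_elements is hypothetical, the check is a plain text match rather than schema validation, and it only sees elements that open and close on one line.

```shell
#!/bin/sh
# Sketch only: empty_elements is a hypothetical helper, not part of the
# KFSXFPPC chain. It does not replace the schema validation in the job.

# empty_elements FILE
# Prints "line:text" for each line holding an element with no content,
# either <tag></tag> (whitespace allowed inside) or <tag/>.
# Exit status is 1 when nothing is found, as with plain grep.
empty_elements() {
    grep -n -E '<[A-Za-z][^<>/]*>[[:space:]]*</|<[A-Za-z][^<>]*/>' "$1"
}
```

Any output points at a line to inspect by hand. Whether an empty element is actually an error still depends on the schema.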

Aborted Module Name: OSYSJOBS_04.OSYSPURG_01

Date: Day: Time: Resolution:

03/05/10 Fri 16:31 Restarted by Jan.

Error log and follow up comments:

** REMOVE log FILES OLDER THAN 30 DAYS

***

<#/ais02/job/temp/sys_purg_rsh.ksh.805#> cat /ais02/dat/work/prod/OSYSJOBS_04.OSYSPURG_01.4028488.4028492.00_too_old

<#/ais02/job/temp/sys_purg_rsh.ksh.805#> xargs -n25 rm -ef

rm: Removing ./access_log.1265068800

rm: Removing ./error_log.1265068800

rm: Removing ./ssl_request_log.1265068800

*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1

It is difficult to find errors in all the output; I had a tough time myself.

But this is why the OSYSJOBS_04.OSYSPURG_01 jobs failed.

The alm_orautl filesystem was not mounted on Kebler for some reason and it could not perform the find command. I mounted the filesystem.

<#/ais02/job/temp/sys_purg_rsh.ksh.916#> find /alm_orautl/hrdevl/ -type f -mtime +7 -print

<#/ais02/job/temp/sys_purg_rsh.ksh.916#> 1>> /ais02/dat/work/prod/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00_too_old

find: 0652-010 The starting directory is not valid.

<#/ais02/job/temp/sys_purg_rsh.ksh.916#> errtrap_rsh /ais02/job/temp/sys_purg_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.78#> [[ 1 > 0 ]]

<#errtrap_rsh.78#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1

***

<<errtrap_rsh.7>> exit 1

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00.2010_03_07_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00.2010_03_07_1630.log

+ cut -f 2 -d =

+ grep SCRIPT ABORTED

rsh_return_code=1

+ rm -ef /ais02/log/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00.2010_03_07_1630.log

rm: Removing /ais02/log/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00.2010_03_07_1630.log

+ print *** \n*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1 \n*** EXIT WITH EXIT CODE=1 \n***

***

*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1

*** EXIT WITH EXIT CODE=1

Rich.
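
Since the cause is easy to miss in the trace output, a small filter can help. The sketch below prints only lines that look like final error reports, using three patterns seen in this log: lines starting with *** ERROR:, numbered system messages such as 0652-010, and ORA- codes. The function name show_errors is hypothetical, and the patterns are illustrative rather than complete.

```shell
#!/bin/sh
# Sketch only: show_errors is a hypothetical helper and its patterns are
# examples drawn from this log, not a complete list.

# show_errors FILE
# Prints "line:text" for lines that look like real error reports.
# Trace lines that merely echo a print command are skipped because the
# "*** ERROR:" pattern is anchored to the start of the line.
show_errors() {
    grep -n -E '^\*\*\* ERROR:|: [0-9]{4}-[0-9]{3} |ORA-[0-9]{5}' "$1"
}
```

Run against the output above, a filter like this would surface the `find: 0652-010` line, which names the real problem, ahead of the generic SCRIPT ABORTED message.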

Aborted Module Name: KFSXON_D1

Date: Day: Time: Resolution:

01/04/10 Mon 13:30 See note from Janice below.

Error log and follow up comments:

KFSXON_D1 is failing with the permission error below:

+ rsh_logname=/ais02/log/KFSXON_D1.KFSX_ONLINE_01.3761702.3761703.00.2010_01_04_1329.log

+ rsh Malta2 -l jobsys /ais02/job/prod/kshexe_rsh /usr/local/bin/startup_kfs_tomcat KFSXON_D1.KFSX_ONLINE_01.3761702.3761703.00 2010_01_04_1329 prd

rshd: 0826-813 Permission is denied.

+ exit 1

Child: Job return = 1

David.

As per the recommendation that we should be using the more secure hostname of Malta2, rather than Malta, I modified the production KFSX Tomcat startup/shutdown chain components to use Malta2. I tested the change on AWTEST, with Guffey2, which worked fine. However, it was critical that the connection to Malta2 be tested before tonight's production so I also ran this KFSX_ONLINE test on AWPROD. By the way, it is safe to test bringing Tomcat up against kfsprd because if it is already up, then the script just reports that fact. This allows us a safe mechanism by which to verify that the rsh connection to the production host machine (Malta2) will function properly. Even though the Guffey to Guffey2 change worked fine, the Malta to Malta2 change caused a permissions issue with the rsh command. After discussing the problem with Ron, it was decided to modify the rsh to use root instead of jobsys which solved the problem with Malta2.

We should be "good to go" for tonight's production KFSX_OFFLINE/KFSX_ONLINE chains.

Janice.

Aborted Module Name: FAIDCFEX_RC.SWPCOFE_01

Date: Day: Time: Resolution:

05/28/13 Tue 10:04 Restarted by David.

Error log and follow up comments:

Running SWPCOFE MC:9.0.5

... going through args k=1 arg=-f

... going through args k=2 arg=-o

... going through args k=3 arg=/appworx/out/swpcofe_3091227.lis

Username: Connected.

Run Sequence Number:

Encountered Abort Condition

Message is: ABORT: Reconcilation file locked for this term. File not processed.

Craig,

I’ve created incident I04902 to run an update script to reset the swrpass_recon_lock to null.

Please reply to Candy & David when this is complete.

1 row.

update swrpass

set swrpass_recon_lock = null

where swrpass_term_code = '201310'

Phil.

I have run the update:

SQL>

update swrpass

set swrpass_recon_lock= null

where swrpass_term_code = '201310'

SQL> /

1 row updated.

SQL> commit;

Commit complete.

Craig.

Aborted Module Name: KFSXPDCH.SEND_MAIL_01

Date: Day: Time: Resolution:

06/08/11 Wed 14:35 Restarted by Janice.

Error log and follow up comments:

# - For pdp_check_20110608_142215.xml:

# -

# - Bank 02 Count: 4

# - Amount: $2200.00

# - Start Bank 02 check disbursement number: 804520

# -

# - Bank 05 Count: 10

# - Amount: $14557.75

# - Start Bank 05 check disbursement number: 128336

# -

# - ====================================================================

#------------------------------------------------------------------------------

# - Sending Message

# MIME::Lite version : 3.027

# MAIL COMMAND : smtp.colostate.edu , Debug => '0', Timeout => '60'

# BUILDING HEADERS

# BUILDING BODY

SMTP Failed to connect to mail server: A system call received a parameter that is not valid.

at /appworx/csu/exec/SENDMAIL.PL line 792

error is 255

===== Exiting PERL_CSU =====

+ err=255

As can be seen from the output log, the email was properly formatted but the module failed due to a connection problem to the mail server. In this case, we do not want to “re-do” all the complicated conditions associated with the chain component. Therefore, I deleted all the conditions (in backlog) for the aborted KFSXPDCH.SEND_MAIL_01 component and then restarted the component. Prior to restarting, I also verified that the #WAIT_FOR_CHK_status_6391208 subvar contained the value “checks”. This value triggers the CHAIN_FINISH component of KFSXPDCH_PDP_CHECKS_EXTR to request KFSXBURS_FT_TRANSFER_TO_BURSAR, which transfers the checks to the bursar’s server. It was clear from the SEND_MAIL component output log that checks were produced, hence the need to confirm that #WAIT_FOR_CHK_status_6391208 had been properly populated.

Let me know if you have questions.

Janice.

Aborted Module Name: AREGDRGC_SP.WAIT_FOR_DARS_01

Date: Day: Time: Resolution:

01/22/10 Fri 00:06 No output File - Deleted per Vickie.

Error log and follow up comments:

The WAIT_FOR_DARS component aborted because within the job_queue_list table, status=E for the particular jobid (ba10012200041515) for which it was waiting. The status value must be “D” for successful completion of the WAIT_FOR_DARS component.

Janice.

Jamie is taking care of a data condition. So let’s delete the copy of AREGDRGC that aborted.

Denise Holcombe should be calling over to schedule it for tonight.

Vicki.

Aborted Module Name: HRMSS041.HRMSS041_01

Date: Day: Time: Resolution:

01/08/10 Fri 22:03 See notes below.

Error log and follow up comments:

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 675

ERROR at line 1:

ORA-20000: **** FATAL ERROR! ****

ORA-06512: at line 791

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

22:02:17 791 raise_application_error(-20000, '**** FATAL ERROR! ****');

22:02:17 675 l_segment := csuh_edi_834_pkg.edi_dmg(

HR has fixed the error. You may restart the job or whatever you do to complete this job and create the file.

IT scheduling, when you send out error messages, can you please be sure to add this type of information if it exists, as this is what tells us what the real error is and which record it was processing when it failed (see red section).

3384959,523712401,"Cutler, Zachary Lucas",19,01/01/2008

095549905,****,"Haas, Donald Edward",EMP,Y,01/01/2008

095549905,,"Haas, Donald Edward",EMP,01/01/2008

095620720,****,"Jones, David S",FAM,Y,01/01/2008

095620720,,"Jones, David S",FAM,01/01/2008

095620720,359648679,"Phelan, Jane P",01,01/01/2008

095620720,536334618,"Phelan-Jones, Savanna B",19,01/01/2008

096469970,****,"O'Grady, Pamela S",FAM,N,01/01/2010

096469970,,"O'Grady, Pamela S",FAM,01/01/2010

096469970,074501050,"O'Grady, Thomas",01,01/01/2010

096469970,001880393,"O'Grady, Brennan",19,01/01/2010

096469970,003886229,"O'Grady, Connor",19,01/01/2010

Error Type: 1

Element Reference: DMG03

Element Value: <NULL>

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 675

Kathy.
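
The request above amounts to sending the records processed just before the error along with the error block itself. A sketch of pulling that context out of a log: print the N lines before the first line containing a given text, then everything after it. The function name error_context is hypothetical.

```shell
#!/bin/sh
# Sketch only: error_context is a hypothetical helper.

# error_context FILE TEXT N   (N must be 1 or more)
# Prints the N lines before the first line containing TEXT, that line,
# and every line after it.
error_context() {
    awk -v pat="$2" -v n="$3" '
        found { print; next }
        index($0, pat) {
            # Replay the last n lines kept in the ring buffer, oldest first.
            for (i = NR - n; i < NR; i++) if (i > 0) print buf[i % n]
            print
            found = 1
            next
        }
        { buf[NR % n] = $0 }
    ' "$1"
}
```

For the failure above, something like `error_context "$logfile" "Error Type:" 12` would return the last records written plus the Element Reference and Error lines. Here logfile stands for whichever output file the job produced.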

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

01/25/10 Mon 14:15 Restarted by ITS.

Error log and follow up comments:

2010-01-25 14:09:09,129 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPD.disbursementVoucherPreDisbursementProcessorExtractStep.3848111.3848131.00 steps: [disbursementVoucherPreDisbursementProcessorExtractStep]

2010-01-25 14:09:09,129 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springmodules.orm.ojb.OjbOperationException: OJB operation failed; nested exception is org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: disbursementVoucherDocumentdocumentHeaderId(documentNumber,versionNumber)=585020(585020,9)

RunBatch ERROR: Exception found:

org.springmodules.orm.ojb.OjbOperationException: OJB operation failed; nested exception is org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: disbursementVoucherDocumentdocumentHeaderId(documentNumber,versionNumber)=585020(585020,9)

Caused by: org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: disbursementVoucherDocumentdocumentHeaderId(documentNumber,versionNumber)=585020(585020,9)

at org.apache.ojb.broker.accesslayer.JdbcAccessImpl.executeUpdate(JdbcAccessImpl.java:522)

at org.apache.ojb.broker.core.PersistenceBrokerImpl.storeToDb(PersistenceBrokerImpl.java:1918)

at org.apache.ojb.broker.core.PersistenceBrokerImpl.store(PersistenceBrokerImpl.java:886)

at org.apache.ojb.broker.core.PersistenceBrokerImpl.store(PersistenceBrokerImpl.java:923)

at org.apache.ojb.broker.core.PersistenceBrokerImpl.store(PersistenceBrokerImpl.java:793)

at org.apache.ojb.broker.core.DelegatingPersistenceBroker.store(DelegatingPersistenceBroker.java:220)

at org.springmodules.orm.ojb.PersistenceBrokerTemplate$9.doInPersistenceBroker(PersistenceBrokerTemplate.java:246)

at org.springmodules.orm.ojb.PersistenceBrokerTemplate.execute(PersistenceBrokerTemplate.java:141)

at org.springmodules.orm.ojb.PersistenceBrokerTemplate.store(PersistenceBrokerTemplate.java:244)

at org.kuali.rice.kns.dao.impl.DocumentDaoOjb.save(DocumentDaoOjb.java:62)

at org.kuali.rice.kns.service.impl.DocumentServiceImpl.validateAndPersistDocument(DocumentServiceImpl.java:674)

at org.kuali.rice.kns.service.impl.DocumentServiceImpl.validateAndPersistDocumentAndSaveAdHocRoutingRecipients(DocumentServiceImpl.java:350)

at org.kuali.rice.kns.service.impl.DocumentServiceImpl.saveDocument(DocumentServiceImpl.java:121)

at sun.reflect.GeneratedMethodAccessor372.invoke(Unknown Source)

Caused by: org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else:

Should be able to rerun the job. Looks like Frank E Johnson was in the middle of acknowledging this document (585020) at 2:06.

Kevin.

Aborted Module Name: KFSXFPPC.KFSX_JAVA_03

Date: Day: Time: Resolution:

01/25/10 Mon 19:18 Restarted by ITS.

Error log and follow up comments:

RunBatch ERROR: Exception found:

org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

Caused by: javax.transaction.RollbackException

at org.objectweb.jotm.TransactionImpl.commit(TransactionImpl.java:245)

at org.objectweb.jotm.Current.commit(Current.java:488)

at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:842)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:651)

at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:621)

Invalid email address, Lisa.Klopp@ColoState.EDU

I changed the email address to Purch_acard_help_desk@colostate.edu. Please rerun the job.

Kevin.

Is Linda.Zafarana@colostate.edu also a problem?

/ais02/log $ grep 'User unknown' *3850859* |more

KFSXFPPC.KFSX_JAVA_03.3850859.3850867.00.2010_01_25_1900.log: com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Linda.Zafarana@colostate.edu>... User unknown

KFSXFPPC.KFSX_JAVA_03.3850859.3850867.00.2010_01_25_1900.log: com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1

Jan.

I set the email addresses to Purch_pg_acard_help_desk@mail.colostate.edu. Please run job.

Kevin.

On AWTEST/kfsdevl too?

Janice.

I updated kfsdevl for both of these invalid emails. Set to Purch_pg_acard_help_desk@mail.colostate.edu.

John Walker.
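
Because one bad address was fixed and a second then surfaced, it may help to list every rejected address in a log in one pass. The sketch below extracts each distinct address from "550 5.1.1 <address>... User unknown" messages, including cases where the address has wrapped onto the following line, as can happen in pasted output. The function name rejected_addresses is hypothetical.

```shell
#!/bin/sh
# Sketch only: rejected_addresses is a hypothetical helper.

# rejected_addresses FILE
# Prints each distinct address rejected with "550 5.1.1 ... User unknown".
rejected_addresses() {
    awk '
        # Join all lines so a message split across two lines still matches.
        { text = text " " $0 }
        END {
            while (match(text, /550 5\.1\.1 *<[^>]+>\.\.\. User unknown/)) {
                hit = substr(text, RSTART, RLENGTH)
                sub(/^[^<]*</, "", hit)   # drop everything up to "<"
                sub(/>.*$/, "", hit)      # drop ">" and what follows
                if (!(hit in seen)) { seen[hit] = 1; print hit }
                text = substr(text, RSTART + RLENGTH)
            }
        }
    ' "$1"
}
```

The list it prints can be handed to whoever maintains the addresses, so all of them can be corrected before the job is rerun.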

Aborted Module Name: ADMSAPPL.ADMSS481_01

Date: Day: Time: Resolution:

01/11/11 Tue 22:39 Restarted by ITS.

10/28/13 Mon 22:26 Restarted by Joleen.

Error log and follow up comments:

01/11/11.

ERROR at line 1:

ORA-00001: unique constraint (SATURN.SABSUPL_KEY_INDEX) violated

ORA-06512: at line 958

22:39:15 958 insert into sabsupl

22:39:15 959 (sabsupl.sabsupl_pidm,

22:39:15 960 sabsupl.sabsupl_term_code_entry,

22:39:15 961 sabsupl.sabsupl_appl_no,

22:39:15 962 sabsupl.sabsupl_city_birth,

22:39:15 963 sabsupl.sabsupl_natn_code_birth)

22:39:15 964 VALUES

22:39:15 965 (supl_rec.sabiden_pidm,

22:39:15 966 supl_rec.saradap_term_code_entry,

22:39:15 967 supl_rec.saradap_appl_no,

22:39:15 968 substr(supl_rec.swrlcit_birth_city,1,20),

22:39:15 969 supl_rec.swrlcit_birth_natn);

22:39:15 970 dbms_output.put_line('adding sabsupl recs for '||v_id||' '||cnt);

22:39:15 971 cnt := cnt + 1;

22:39:15 972 end loop;

By the way, this program appears to produce a lot of display output, making the output file rather large. As an example, the applications' "essays" appear to be echoed out, which can be quite lengthy and is of questionable value for debugging purposes.

Janice.

Please restart the chain.

The record that caused the unique constraint (SATURN.SABSUPL_KEY_INDEX) violation was deleted.

Rami.

10/28/13.

Similar error as reported 01/11/11.

The program was trying to add a sabsupl record when a record already existed. This student had submitted multiple applications with different birth city records. Janet Allen in Admissions deleted the existing sabsupl record. I put a temp version of ADMSS481 on Kebler that excluded the aidms from the student's prior applications. The long-term fix will be to add to the supl_cursor the run date criteria that all of the other cursors use; it is missing in this one.

Kathy.
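
A minimal sketch of the failure mode described above, using Python's built-in sqlite3 rather than Oracle. The table and column names are shortened stand-ins for SABSUPL, and the skip-if-exists guard is only an illustration; it is not the run date change to supl_cursor that Kathy describes.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table with a unique key like SABSUPL_KEY_INDEX (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sabsupl (pidm INTEGER, term TEXT, appl_no INTEGER, "
        "birth_city TEXT, UNIQUE (pidm, term, appl_no))"
    )
    return conn


def load_supplemental(conn, rows):
    """Insert (pidm, term, appl_no, birth_city) rows, skipping keys that already exist.

    Returns the keys that were skipped so they can be reviewed by hand.
    """
    skipped = []
    for pidm, term, appl_no, city in rows:
        exists = conn.execute(
            "SELECT 1 FROM sabsupl WHERE pidm = ? AND term = ? AND appl_no = ?",
            (pidm, term, appl_no),
        ).fetchone()
        if exists:
            skipped.append((pidm, term, appl_no))
            continue
        conn.execute(
            "INSERT INTO sabsupl (pidm, term, appl_no, birth_city) VALUES (?, ?, ?, ?)",
            # Mirrors substr(..., 1, 20) in the original insert.
            (pidm, term, appl_no, city[:20] if city else city),
        )
    return skipped
```

Without the existence check, the second row for the same key would raise an integrity error, which is the sqlite3 counterpart of ORA-00001.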

Aborted Module Name: ODSRKFSX.ODSRS002_01

Date: Day: Time: Resolution:

02/01/10 Mon 00:06 Restarted by ITS.

05/03/11 Tue 00:08 See follow up below.

Error log and follow up comments:

02/01/10.

23:42:14 149 csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_FP_DV_OWNR_TYP_T');

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUF_FP_DV_OWNR_TYP_T

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 149

I have commented out the mapping from the ODSRS002. This mapping is no longer valid with the current version of KFS.

Mark.

05/03/11.

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUF_COFRS_DETAIL_T

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 140

There appears to be a tablespace issue. One of the DBAs will have to look at it when they get in.

Here is the error on the database:

ORA-01652: unable to extend temp segment by in tablespace

Mark.

Aborted Module Name: KFSXPDSA.KFSX_JAVA_01

Date: Day: Time: Resolution:

02/03/10 Wed 08:13 Restarted by Jan.

Error log and follow up comments:

2010-02-03 08:13:47,832 [main] ERROR org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl :: sendAchAdviceEmail() Invalid email address. Sending message to BFS_AcctPay@mail.colostate.edu

org.kuali.rice.kns.mail.InvalidAddressException: org.springframework.mail.MailSendException; nested exceptions (1) are:

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <cj.anderson@colostate.edu>... User unknown

;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <cj.anderson@colostate.edu>... User unknown

Jan.

I created Clarity incident I02405 to be assigned to a DBA to correct the invalid email and set to BFS_AcctPay@mail.colostate.edu.

Once the production table is updated, please notify IS Scheduling to re-run the job.

Someone in BFS should update the Payee ACH Account for Christopher Anderson and update his email address (to prevent future aborts).

update pdp_pmt_grp_t

set adv_email_addr = 'BFS_AcctPay@mail.colostate.edu'

where adv_email_addr = 'cj.anderson@colostate.edu'

and adv_email_snt_ts is null

Kevin.

If this is the standard sql statement which needs to be run for bad email addresses, maybe we could create an On-Request Applications Manager chain to run this sql, passing the “bad email address” into the sql as a parameter? Then, when such problems occur, BFS could provide a Control Memo to IT Scheduling to request that the chain run and provide the bad email address parameter value. Of course, BFS would still need to manually update the Payee ACH Account.

Just a thought – Janice.
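
A rough sketch of the parameterized version Janice suggests, written against Python's built-in sqlite3 so it can be tried locally. The function name is hypothetical, and how an Applications Manager chain would actually pass the parameter is not shown here; only the table and column names come from Kevin's statement.

```python
import sqlite3

FALLBACK = "BFS_AcctPay@mail.colostate.edu"


def redirect_unsent_advice(conn, bad_address, fallback=FALLBACK):
    """Point unsent ACH advice rows for bad_address at the fallback mailbox.

    The bad address is bound as a parameter rather than pasted into the SQL text.
    Returns the number of rows changed.
    """
    cur = conn.execute(
        "UPDATE pdp_pmt_grp_t SET adv_email_addr = ? "
        "WHERE adv_email_addr = ? AND adv_email_snt_ts IS NULL",
        (fallback, bad_address),
    )
    conn.commit()
    return cur.rowcount
```

Binding the address as a parameter keeps the statement fixed from run to run, so only the value supplied on the Control Memo changes.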

There is a clarity task to fix the java program so it will send the email to bfs_acctpay and not fail.

Kevin.

Oh.. that’s even better! Thanks for the update…..Janice.

The Check and ACH jobs are held up until this job completes.

Kevin.

This change has been made in KFSPRD…….Kelly.

Aborted Module Name: FAIDCFIM_COF_IMPORT

Date: Day: Time: Resolution:

09/21/10 Tue 12:20 See notes below.

Error log and follow up comments:

Phil has just informed us that COF will not have a file available today for the FAIDCFIM_FA.COF_RESP_01 to process.

While this component would eventually abort when it doesn’t find the file, it would be best to simply handle the situation now.

Please proceed with the steps outlined below, in the order specified:

1) Kill the FAIDCFIM_FA.COF_RESP_01 component – it should end up in KILLED status

2) Delete all the chain components which are in PRED WAIT status, except for the FAIDCFIM_FA.CHAIN_FINISH_01 component.

3) Verify that all chain components which were deleted are in PW-DELETE status.

4) Delete the “KILLED” FAIDCFIM_FA.COF_RESP_01 component.

5) Verify that the FAIDCFIM_FA.CHAIN_FINISH_01 component finishes, thereby allowing FAIDCFIM_COF_IMPORT chain to complete.

Let me know if you have questions.

Janice.

Aborted Module Name: HRMSS041.HRMSS041_01

Date: Day: Time: Resolution:

02/08/10 Mon 21:50 Restarted by ITS.

Error log and follow up comments:

21:50:51 676 l_segment := csuh_edi_834_pkg.edi_dmg(

21:50:51 677 p_dmg01 => 'D8'

21:50:51 678 ,p_dmg02 => to_char(c2.date_of_birth,'YYYYMMDD')

21:50:51 679 ,p_dmg03 => c2.sex);

21:50:51 680 write_segment(ws_file_handle, l_segment,l_segment_count);

21:50:51 681

21:50:51 787 when others then

21:50:51 788 dbms_output.put_line(substr(sqlerrm,1,250));

21:50:51 789 utl_file.fclose(ws_file_handle);

21:50:51 790 DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);

21:50:51 791 DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_backtrace);

21:50:51 792 raise_application_error(-20000, '**** FATAL ERROR! ****');

21:50:51 793

449983464,468623157,"Carpenter, Marian",01,02/01/2010

449983464,400477316,"Carpenter, Abigail",19,02/01/2010

449983464,404453088,"Carpenter, Blair",19,02/01/2010

Error Type: 1

Element Reference: DMG03

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 676

ERROR at line 1:

ORA-20000: **** FATAL ERROR! ****

ORA-06512: at line 792

Jennifer or Teri,

Can you please check out the Carpenter dependents and make sure they all have a gender? This job aborted this morning. We have updated the form to make this field required to eliminate this data issue, but it will not be in production until today or tomorrow.

Please let us and IT scheduling know when you have this fixed and they will restart the job.

Kathy.

Hi Jackie,

Please investigate this, since you are now the responsible party for new hire entry in our office. Let me know if you need any help. Teri

P.S. If you see the gender missing, just make sure you are date tracked appropriately and add it. Then notify Kathy to proceed with the file.

Aborted Module Name: AREGCNTB.ODSRS100_01

Date: Day: Time: Resolution:

02/08/10 Mon 08:40 Restarted by ITS.

Error log and follow up comments:

02:01:03 255 --*--------------------------------------------------------------------*

02:01:03 256 --************ ADD Records to CUR table from view course_schedule *****

02:01:03 257 --*--------------------------------------------------------------------*

02:01:03 258 begin <<add_cur3>>

02:01:03 259 insert into csus_applicant_cen_cur

02:01:03 260 (select * from csus_applicant

02:01:03 261 where ltrim(rtrim(term)) = csus_f_cur_term_ods);

02:01:03 262 end add_cur3;

02:01:03 263 v_add3_count := SQL%ROWCOUNT;

02:01:03 264

02:01:03 265 end del_applicant;

02:01:03 266

02:01:03 267 --*************************************************************************

02:01:03 268 -- FIELD OF STUDY

02:01:03 269 --*************************************************************************

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-00001: unique constraint (CSUBAN.CSUS_APPLICANT_CEN_CUR_IX_01) violated

ORA-06512: at line 259

The problem appears to be with PIDM = 11150486, APLN_REF_NUMBER = 3

There are two entries for this person. I will work with folks to figure out what the data issue is and we will get it resolved and finish creating the CENSUS Tables.

Vicki.
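
A small sketch of how duplicate keys like the one Vicki found could be listed before the insert runs. It assumes the unique index covers PIDM and APLN_REF_NUMBER, which the log does not confirm, and the function name is hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key_fields=("pidm", "apln_ref_number")):
    """Return the key tuples that occur more than once in rows (a list of dicts)."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)
```

Any key returned here would trip the unique constraint on the second insert.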

Aborted Module Name: HRMSS041.HRMSS041_01

Date: Day: Time: Resolution:

02/09/10 Mon 08:40 Restarted by ITS.

Error log and follow up comments:

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 676

ERROR at line 1:

ORA-20000: **** FATAL ERROR! ****

ORA-06512: at line 792

08:40:03 676 l_segment := csuh_edi_834_pkg.edi_dmg(

08:40:03 788 dbms_output.put_line(substr(sqlerrm,1,250));

08:40:03 789 utl_file.fclose(ws_file_handle);

08:40:03 790 DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);

08:40:03 791 DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_backtrace);

08:40:03 792 raise_application_error(-20000, '**** FATAL ERROR! ****');

I.T. Scheduling,

Can you please add line 676 and the lines around 676 plus the last couple of lines from the output so we can see what person it aborted on?

Kathy.

08:40:03 674

08:40:03 675 -- Member Demographics

08:40:03 676 l_segment := csuh_edi_834_pkg.edi_dmg(

08:40:03 677 p_dmg01 => 'D8'

08:40:03 678 ,p_dmg02 => to_char(c2.date_of_birth,'YYYYMMDD')

08:40:03 679 ,p_dmg03 => c2.sex);

08:40:03 680 write_segment(ws_file_handle, l_segment,l_segment_count);

08:40:03 681

08:40:03 682 -- Health Coverage

08:40:03 683 l_segment := csuh_edi_834_pkg.edi_hd(

08:40:03 684 p_hd01 => '030'

08:40:03 685 ,p_hd03 => 'DEN'

08:40:03 686 ,p_hd04 =>

551797647,526992512,"Tjalkens, Kimberly",01,10/01/2009

551797647,645807196,"Tjalkens, Jacob C",19,10/01/2009

551797647,652071440,"Tjalkens, Jordan F",19,10/01/2009

551797647,627806573,"Tjalkens, Luke R",19,10/01/2009

551806928,****,"Machol, Janet Lynn",EMP,Y,10/01/2009

551806928,,"Machol, Janet Lynn",EMP,10/01/2009

551911231,****,"Schroder, Daniel James",FAM,Y,02/01/2010

551911231,,"Schroder, Daniel James",FAM,02/01/2010

551911231,341763917,"Schroder, Amy Gillings",01,02/01/2010

551911231,652388519,"Schroder, Finn Joseph",19,02/01/2010

Error Type: 1

Element Reference: DMG03

Element Value: <NULL>

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 676

Thank you so much. This is exactly what we need and we can just forward this onto HR to fix without us having to do any investigative work.

Kg.
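
The DMG03 failure above comes from a member with no gender on file. A minimal sketch of a pre-check that could list such members before the file is built is shown here; the record layout (dicts with name and sex keys) and the function name are assumptions for illustration, not the structure HRMSS041 actually uses.

```python
def missing_gender(members):
    """Return the members whose 'sex' value is empty, blank, or None.

    members is a list of dicts with at least 'name' and 'sex' keys.
    """
    return [m for m in members if not (m.get("sex") or "").strip()]
```

A list produced this way could be forwarded to HR ahead of the run instead of after an abort.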

Aborted Module Name: KFSXFPPC.KFSX_JAVA_01

Date: Day: Time: Resolution:

02/08/10 Mon 15:02 Restarted by David.

Error log and follow up comments:

org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.validateContentsAgainstSchema(XmlBatchInputFileTypeBase.java:172)

at org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.parse(XmlBatchInputFileTypeBase.java:109)

at org.kuali.kfs.sys.batch.service.impl.BatchInputFileServiceImpl.parse(BatchInputFileServiceImpl.java:73)

at org.kuali.kfs.fp.batch.service.impl.ProcurementCardLoadTransactionsServiceImpl.loadProcurementCardFile(ProcurementCardLoadTransactionsServiceImpl.java:67)

... 4 more

Caused by: org.xml.sax.SAXParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element 'transaction'. One of '{"http://www.kuali.org/kfs/fp/procurementCard":transactionCreditCardNumber}' is expected.

at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)

... 23 more

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

David.

Someone’s CC# was wrong. We suspect it was manually entered by Purchasing and they entered the last digit incorrectly. (They have been running some cards through the State’s PaymentNet system as a test.)

The system has been corrected. KFSXS008 will need to be rerun. I am looking to see if there is anything else that needs to be cleaned up in the database. I will contact David when I am ready to rerun.

Kevin.

Aborted Module Name: AGENWYWP.AGENS004_01

Date: Day: Time: Resolution:

02/08/10 Mon 20:17 Restarted by ITS.

Error log and follow up comments:

18:00:46 559 put_report_line1('Persons Not Purged: ' || record_count1);

18:00:46 560 put_report_line2('Persons Purged: ' || record_count2);

18:00:46 561 -- report any records that were not able to process because the pidm

18:00:46 562 -- is not found in Banner

18:00:46 563 csug_notworked := 0;

18:00:46 564 v_rptlist1 := null;

18:00:46 565

18:00:46 566 select count(*) into csug_notworked

18:00:46 567 from csug_purge_ids

18:00:46 568 where marked_flag != 'Y';

ERROR at line 1:

ORA-00001: unique constraint (GENERAL.GOBSRID_KEY_INDEX) violated

ORA-06512: at "BANINST1.ICGOKCOM", line 675

ORA-06512: at "BANINST1.ICSPKLDI", line 468

ORA-06512: at "BANINST1.ICSPKLDI", line 561

ORA-06512: at "GENERAL.GT_GOREMAL_AS_LDI", line 5

ORA-04088: error during execution of trigger 'GENERAL.GT_GOREMAL_AS_LDI'

ORA-06512: at line 602

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

18:00:46 5 file_handle utl_file.file_type;

18:00:46 468 -- purgeable report

18:00:46 602 raise;

Robin: Not sure if this will make much sense, but in the errors:

ERROR at line 1:

ORA-00001: unique constraint (GENERAL.GOBSRID_KEY_INDEX) violated

ORA-06512: at "BANINST1.ICGOKCOM", line 675

ORA-06512: at "BANINST1.ICSPKLDI", line 468

ORA-06512: at "BANINST1.ICSPKLDI", line 561

ORA-06512: at "GENERAL.GT_GOREMAL_AS_LDI", line 5

ORA-04088: error during execution of trigger 'GENERAL.GT_GOREMAL_AS_LDI'

ORA-06512: at line 602

The only one that refers to the sql (AGENS004) is the last one. That’s why you couldn’t find line 675. Line 675 refers to the database package BANINST1.ICGOKCOM and so on for the other errors. We do need to see this entire error, but you won’t find anything helpful in the output for anything other than the last line.

Please restart AGENS004. We think we’ve got the data fixed.

Bev.
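
Bev's point about the error stack can be sketched in a few lines: ORA-06512 lines that name an object in quotes refer to database packages or triggers, while a bare "at line N" refers to the script being run. The helper name below is hypothetical.

```python
import re

# ORA-06512 lines look like:  ORA-06512: at "OWNER.OBJECT", line N
# or, for the anonymous block: ORA-06512: at line N
_FRAME = re.compile(r'ORA-06512: at (?:"(?P<obj>[^"]+)", )?line (?P<line>\d+)')


def script_lines(error_text):
    """Return the line numbers that refer to the submitted script itself."""
    return [
        int(match.group("line"))
        for match in _FRAME.finditer(error_text)
        if match.group("obj") is None
    ]
```

For the stack quoted above, only line 602 belongs to AGENS004, so that is the only number worth looking up in the job output.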

Aborted Module Name: HRMSCPR_QPS.HRMSS063_01

Date: Day: Time: Resolution:

02/11/10 Thu 08:18 Restarted by ITS.

Error log and follow up comments:

Thu Feb 11 08:18:31 :**** Start of HRMSS063 02/11/2010 08:18:28

Thu Feb 11 08:18:31 :Org Default Account: Williams,Kathleen Regular Salary 9 Month

Thu Feb 11 08:18:31 :685.00

Thu Feb 11 08:18:31 :Amount Not Distributed: Schwartz,Rachel Supp Pay Misc 533144

Thu Feb 11 08:18:31 :3262.26

Thu Feb 11 08:18:31 :declare

Thu Feb 11 08:18:31 :*

Thu Feb 11 08:18:31 :ERROR at line 1:

Thu Feb 11 08:18:31 :ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

Thu Feb 11 08:18:31 :ORA-06512: at line 1062

Jan.

This problem has been resolved. Account 5331440 was being used on the cost allocation key flex field but it was not on the GL_CODE_COMBINATIONS table. My guess is that this account has never been used on a labor schedule so GL_CODE_COMBINATIONS was never updated with the account value. I manually entered the value using the GL Account form which should cause the error to go away the next time the script is executed.

Steve.

Aborted Module Name: HRMSWKSP_01.AROSS142_01

Date: Day: Time: Resolution:

02/15/10 Mon 21:56 Deleted by ITS.

Error log and follow up comments:

21:55:17 253 dbms_output.put_line('Account Does Not Exist: ' ||

21:55:17 254 p_row.employee_csu_id|| ' ' ||

21:55:17 255 p_row.account_number ||' ' ||

21:55:17 256 va_subcode);

21:55:17 257 raise_application_error(-20100, 'Account Does Not Exist in TBRACCT, Account B');

21:55:17 258 end if;

21:55:17 259 end;

21:55:17 260 END;

21:55:17 488 begin

21:55:17 489 --Call the insert rows proc to create transactions and comments.

21:55:17 490 insert_rows(p_rec,

21:55:17 491 vt_hold,

21:55:17 492 vc_hold);

21:55:17 493 --Populate the Count Variables

21:55:17 494 vtran_count := nvl(vtran_count, 0) + nvl(vt_hold, 0);

21:55:17 495 vcomment_count := nvl(vcomment_count, 0) + nvl(vc_hold, 0);

21:55:17 496 end;

21:55:17 497 end loop;

ERROR at line 1:

ORA-20100: Account Does Not Exist in TBRACCT, Account B

ORA-06512: at line 257

ORA-06512: at line 490

Josh,

Is this that off-campus workstudy billing job? It appears a new account number needs to be set up in AROS?

Kevin.

The work study folks are out of the office today.

Josh.

We deleted the necessary modules and the chain completed.

Dawn.

Aborted Module Name: HRMSCRU_SAL.HRMS_SPAWN_LOG_01

Date: Day: Time: Resolution:

02/18/10 Thu 09:45 Restarted by Janice.

Error log and follow up comments:

+ print *** \n*** LOG FROM SPAWNED CONCURRENT REQUEST 4613252 (PARENT REQUEST 4613251): \n***

+ 1>> /ais01/dat/work/prod/HRMSCRU_SAL.HRMS_SPAWN_LOG_01.Spawned_Log

+ cat /oraapps/hrprod/log/l4613252.req

+ 1>> /ais01/dat/work/prod/HRMSCRU_SAL.HRMS_SPAWN_LOG_01.Spawned_Log

+ read this_spawned_req

+ grep C

+ cut -f2 -d ?

+ print 4613253?X

+ print *** \n*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION \n***

The restart of phase 1 will be tricky – Diane will work with Jan, David or me to facilitate the rerun. We will force the existing HRMSSAL1 Phase 1 chain to stay “stalled” by placing a hold on the downstream HRMSSAL1.CHAIN_EXIT_01 chain component. We plan to re-run portions of the HRMSCRU_COSTING_RUNGEN_SAL chain as a stand-alone chain. Since HRMSCRU_COSTING_RUNGEN_SAL is a “single run” chain, we deleted the ABORTED HRMSCRU_SAL.HRMS_SPAWN_LOG_01 and all remaining chain components for this subchain except for the HRMSCRU_SAL.CHAIN_FINISH_01. The stand-alone run of HRMSCRU_COSTING_RUNGEN_SAL cannot be done until the rollbacks are performed within HR – Diane will let us know when that has completed. No action will be required by IT Scheduling.

Joanne,

I’m not sure if we have ever been in this exact situation before. Unfortunately, the manner in which the HR processes were terminated did not communicate failure of those processes to the Applications Manager chain. Consequently, the RUNGEN component within the HRMSCRU_COSTING_RUNGEN_SAL sub-chain of the HRMSSAL1_PHASE_1_JOBS chain completed successfully – and there is no way for us to rerun that component within the existing HRMSSAL1 chain once it has completed within Applications Manager. However, one of the reasons that the Applications Manager Payroll chains were designed with sub-chains is to facilitate ease of re-running portions of the payroll processing in situations such as we have today. Rerunning the RUNGEN via a stand-alone execution of the HRMSCRU_COSTING_RUNGEN_SAL chain lets us re-run just the key component(s) that need to be rerun, and will be much less error-prone and time-consuming than trying to modify and rerun the entire HRMSSAL1_PHASE_1_JOBS chain.

Regarding a projected time-frame, I can tell you that historically the RUNGEN program averages between 1 – 1 ½ hours to run (when it starts at 03:00 A.M, which is an idle time as far as online activity). However, we cannot even proceed with the RUNGEN rerun until all the rollbacks are completed – Craig/Diane may be able to provide an update regarding progress of that.

If you have any more questions, please let me know.

Janice.

Fyi – the Database rollback process completed and the Payroll Run Rollback is currently underway………….Janice.

Salary payroll has resumed processing. Plan on about 1.5 hours…………..Diane

IT Scheduling:

Please monitor the progress of the stand-alone HRMSCRU_COSTING_RUNGEN chain. Upon successful completion of HRMSCRU_COSTING_RUNGEN (chain id 3957236), please release the hold on HRMSSAL1.CHAIN_EXIT_01 to allow the remainder of the HRMSSAL1_PHASE_1_JOBS chain to complete.

Janice.

The salary payroll job has finished. All assignments processed. 3 errors. The Payroll Exception Report is running right now.

Diane.

Aborted Module Name: KFSXAPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

02/22/10 Mon 07:29 Restarted by Janice.

Error log and follow up comments:

2010-02-22 07:29:08,831 [main] INFO org.kuali.rice.kew.docsearch.SearchableAttribute :: Indexing document 628342 for document search...

JVMDUMP006I Processing Dump Event "systhrow", detail "java/lang/OutOfMemoryError" - Please Wait.

…

JVMDUMP010I Java Dump written to /app/kfs/javacore.20100222.143045.414050.txt

JVMDUMP013I Processed Dump Event "systhrow", detail "java/lang/OutOfMemoryError".

Exception in thread "QuartzScheduler_QuartzSchedulerThread" Exception in thread "Timer-0" java.lang.OutOfMemoryError

java.lang.OutOfMemoryError

Complete java log can be viewed in:

/ais02/log/KFSXAPPD.KFSX_JAVA_01.3970577.3970615.00.2010_02_22_0701.log

Should we try to increase catalina_opts_memory?

Janice.

Aborted Module Name: KFSXPDSA.KFSX_JAVA_01

Date: Day: Time: Resolution:

02/25/10 Thu 08:16 Restarted by Jan.

Error log and follow up comments:

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <hayley.brown@colostate.edu>... User unknown

at org.kuali.rice.kns.service.impl.MailServiceImpl.sendMessage(MailServiceImpl.java:63)

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1

***

I02492 - Fix KFSXPDSA for 02/25/2010

Needs to be assigned to a dba. Once the update statements run the job can be restarted.

Kevin.

This task has been completed.

Kelly.

KFSXPDSA.KFSX_JAVA_01 failed again with below invalid user.

Jan.

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <james.kunesh@colostate.edu>... User unknown

;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <james.kunesh@colostate.edu>... User unknown

I added task T04672 to Incident I02492 to fix james.kunesh@colostate.edu.

Kevin.

Aborted Module Name: FAIDCFAT_SM_GLBDATA-LOOP_01

Date: Day: Time: Resolution:

02/26/10 Fri 06:19 Restarted by David.

Error log and follow up comments:

+ print Failure in spawned GLBDATA - abort this module

Failure in spawned GLBDATA - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

Here is the error from the report:

SUNGARD HIGHER EDUCATION

POPULATION SELECTION EXTRACT

CONTROL REPORT PAGE 1

Start Time: 26-FEB-2010 06:16:21

GLBDATA Version: 8.1

Selection ID 1: FAIDCFAT_SM_GRIP

Application: FINAID

Creator ID: FAUSER

*ERROR* FAIDCFAT_SM_GRIP query does not exist for Applicatio

SQLCODE = 1403

SQL ERROR = ORA-01403: no data found

Same error for Spring:

*ERROR* FAIDCFAT_SP_GRIP query does not exist for Applicatio

SQLCODE = 1403

SQL ERROR = ORA-01403: no data found

David.

Aborted Module Name: FAIDCFAT_SP.GLBDATA-LOOP_01

Date: Day: Time: Resolution:

02/26/10 Fri 06:06 Restarted by David.

Error log and follow up comments:

+ print Failure in spawned GLBDATA - abort this module

Failure in spawned GLBDATA - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

Here is the error from the report:

SUNGARD HIGHER EDUCATION

POPULATION SELECTION EXTRACT

CONTROL REPORT PAGE 1

Start Time: 26-FEB-2010 06:16:21

GLBDATA Version: 8.1

Selection ID 1: FAIDCFAT_SM_GRIP

Application: FINAID

Creator ID: FAUSER

*ERROR* FAIDCFAT_SM_GRIP query does not exist for Applicatio

SQLCODE = 1403

SQL ERROR = ORA-01403: no data found

Same error for Spring:

*ERROR* FAIDCFAT_SP_GRIP query does not exist for Applicatio

SQLCODE = 1403

SQL ERROR = ORA-01403: no data found

David.

Aborted Module Name: HRMSCHK_QPS.CHECK_WRITER_02

Date: Day: Time: Resolution:

02/26/10 Fri 08:10 Restarted by ITS.

Error log and follow up comments:

The following module is in DB ERROR status:

HRMSCHK_QPS.CHECK_WRITER_02

There is no output file. We also looked at the Before and Performed conditions. We viewed the operator log.

Can we just re-start this module?

Yes, try to restart it.

David.

Aborted Module Name: KFSXFPPC.KFSX_JAVA_03

Date: Day: Time: Resolution:

03/01/10 Mon 19:20 Restarted by ITS.

03/10/10 Wed 19:01 Restarted by Janice.

Error log and follow up comments:

03/01/10.

Pcard transactions didn’t “push” out to people’s action lists.

Bad email?

John Hunter

KFSXFPPC.KFSX_JAVA_03 failed with a bad email address.

2010-03-01 19:13:55,618 [main] ERROR org.kuali.rice.kew.mail.service.impl.ActionListEmailServiceImpl :: Error sending Action List email.

org.kuali.rice.kew.exception.WorkflowRuntimeException: javax.mail.SendFailedException: Invalid Addresses;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Lisa.Klopp@ColoState.EDU>... User unknown

Jan.

I updated her email, go ahead and re-run.

John Hunter

03/10/10.

grep: 0652-033 Cannot open /ais02/app/kfs/prd/logs/KFSXFPPC.procurementCardRouteDocumentsStep.4048572.4048580.00*.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

***

+ abort_job_flag=y

+ rm -ef /ais01/dat/work/prod/KFSXFPPC.KFSX_JAVA_03_jobstat

rm: Removing /ais01/dat/work/prod/KFSXFPPC.KFSX_JAVA_03_jobstat

Looks like it finished successfully at 7:31?

Kevin.

Looks like this java program failed last night with:

/ais02/job/temp/kfsx_java_ssh.ksh[79]: 270374 Segmentation fault(coredump)

I simply tried to rerun it this morning and it finished successfully.

Janice.

Aborted Module Name: AGENWYWP.AGENS004_01

Date: Day: Time: Resolution:

03/04/10 Thu 10:47 Deleted by ITS.

Error log and follow up comments:

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character to number conversion error

ORA-06512: at line 602

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

+ /Applications Manager/exec/FILESIZE AGENWYWP.AGENS004_01.4020586.4020591.00.2010_03_04_1047.jobout 100

no output from AGENWYWP.AGENS004_01

+ err=100

We know why this job aborted. Joe will be in contact with Vicki regarding the person that we need to have purged from the system.

Marcella.

This job will hang until tomorrow per Vicki.

David.

Marcella confirmed that we can purge/drop/stop AGENWYWP. We have identified the data problem and are starting the process to clean that up. We will just catch up next week.

Vicki.

Aborted Module Name: HRMSS041.HRMSS041_01

Date: Day: Time: Resolution:

03/05/10 Fri 21:42 Restarted by Jan.

Error log and follow up comments:

+ date

+ echo exiting SQLP_CSU Fri Mar 5 21:42:30 MST 2010

exiting SQLP_CSU Fri Mar 5 21:42:30 MST 2010

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

521151875,,"Collier, Daye Jamal",E1D,03/01/2010

521151875,995000026,"Collier, Dayesum Arthur",19,03/01/2010

Error Type: 1

Element Reference: DMG03

Element Value: <NULL>

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 664

Declare

Jan.

Hi Teri and Jennifer,

The gender is missing for the following person. Please fix this ASAP and let IT Scheduling know so they can restart the job.

Thanks. Kg.

Aborted Module Name: OSYSJOBS_01.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Sat 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a "then" was missing from a newly added if statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components – they finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.
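
A syntax error such as a missing "then" can usually be caught before a script is put into service by asking the shell to parse it without running it. The sketch below wraps that check in Python; it assumes a POSIX shell is on the PATH as "sh" (ksh accepts the same -n option), and the helper name is hypothetical.

```python
import subprocess


def shell_syntax_ok(script_path, shell="sh"):
    """Parse script_path with `<shell> -n`, which reads commands without executing them.

    Returns (ok, message): ok is True when the shell reports no syntax error,
    and message is whatever the shell wrote to stderr.
    """
    result = subprocess.run(
        [shell, "-n", script_path], capture_output=True, text=True
    )
    return result.returncode == 0, result.stderr.strip()
```

Note that -n only checks syntax; it does not catch wrong paths or bad variable values.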

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.
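
The file name in this error appears to have gained a new .bak.<timestamp> suffix on every run until it exceeded the path length limit. The purge script's own logic is not shown in this log, so the following is only a sketch of one way to avoid stacking suffixes; the function name and the timestamp pattern are assumptions based on the name above.

```python
import re

# Matches one or more trailing ".bak.YYYY_MM_DD_HHMM" suffixes.
_BAK_SUFFIXES = re.compile(r"(?:\.bak\.\d{4}_\d{2}_\d{2}_\d{4})+$")


def backup_name(filename, stamp):
    """Return filename with exactly one .bak.<stamp> suffix, replacing any older ones."""
    base = _BAK_SUFFIXES.sub("", filename)
    return f"{base}.bak.{stamp}"
```

An alternative would be to skip files that already carry a .bak suffix, which leaves earlier backups untouched.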

Aborted Module Name: OSYSJOBS_02.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Sat 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_03.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Sat 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_05.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Sat 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_08.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Sat 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_09.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Sat 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_11.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Fri 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_12.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Fri 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_13.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Fri 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_15.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Fri 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_04.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Fri 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_06.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Fri 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_07.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Fri 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

Aborted Module Name: OSYSJOBS_14.OSYSPURG_01

Date: Day: Time: Resolution:

03/10/10 Wed 16:46 Restarted by ITS

03/20/10 Fri 16:32 Restarted by Jan.

Error log and follow up comments:

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

10 16:30:34-Null

Exiting with su job error code[2].

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.
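
Note on the OSYSPURG path error: the file name in the cp error carries one .bak.<timestamp> suffix per daily run. That suggests the copy step was backing up its own earlier backups, although this is an inference from the path and not something the log states. A minimal sketch for spotting such files follows; the function name list_stacked_backups is hypothetical and is not part of the purge job.

```sh
# Sketch only: list_stacked_backups is a hypothetical helper.
# Usage: list_stacked_backups DIRECTORY
list_stacked_backups() {
    # A name containing ".bak." twice is a backup of a backup.
    find "$1" -type f -name '*.bak.*.bak.*' -print
}
```

Run against a log directory, it lists only the files whose names have already been suffixed more than once, so they can be reviewed before the name grows past the file system limit.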

Aborted Module Name: HRMSCPR_SAL_HRMSS063_01

Date: Day: Time: Resolution:

03/23/10 Tue 13:53 Restarted by David.

Error log and follow up comments:

Org Default Account: Mashek,Kimberly Regular Salary

896.94

Org Default Account: Mashek,Kimberly Regular Salary

323.89

Org Default Account: Roberts,James Retro Salary

375.00

Org Default Account: Florcke,Cornelia Regular Salary

1470.00

declare

ERROR at line 1:

ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

ORA-06512: at line 1062

This should now be fixed so the HRMSS063 script can be restarted. Account 1206320 needed to be added to the GL code combinations table.

Steve Hill.

Aborted Module Name: DB ERRORS

Date: Day: Time: Resolution:

03/15/10 Mon 23:00 Restarted by David.

05/17/10 Mon 10:00 See note from Janice below.

05/20/10 Thu 20:00 Restarted by David.

Error log and follow up comments:

I was paged because the following modules were in DB ERROR status:

WHRSL021.CHAIN_CANCEL_01

WHRSL036.CHAIN_CANCEL_01

The first one listed above did not have an output file, however the conditions were present with Timing of “BEFORE” and Performed of “DONE”, so I called the Oncall Programmer. The second one listed above had an output file present, so I called the Oncall Programmer. David Peterson was the person I spoke to.

Dawn.

WHRSL021 and WHRSL036 both had DB Errors on the CHAIN_CANCEL modules. I reset the conditions on WHRSL021 and reset it and it completed okay. WHRSL036 however is stuck in backlog. All modules are in finished status and I am unable to delete it. I added this chain to the exception file so WHRS could finish.

Followup:

Greg or Rich will need to delete WHRSL036 from backlog in the morning.

David.

The following module is in DB ERROR status.

HRMSACH_HRL.NACHA_01

This component went into DBERROR status when an AFTER condition attempted to run underlying sql (against HRPROD using the @hrprod_apps link from AWPROD) for the #HRMS_PAYROLL_ACTION_ID subvar.

ErrorMsg: AwE-5001 Database Query Error (5/17/10 9:49 AM)

Details: Error evaluating subvar #HRMS_PAYROLL_ACTION_ID for job: 4408012

java.sql.SQLException: ORA-12154: TNS:could not resolve the connect identifier specified

There is really no way to recover from this other than manually performing this AFTER condition since we cannot rerun this chain component. So, I determined the correct value for Payroll Action ID and updated the #HRMSACH_HRL_PAYROLL_ACTION_ID with the value (30752000). Then I deleted the HRMSACH_HRL.NACHA_01 chain component to allow the remainder of the chain to complete.

Janice.

05/20/2010 21:24 DBARRETT

Received DB Error page on KFSXGLSC_D1.COLLECT_FILES_01.

As there was an output file present, I called David who will investigate.

05/20/2010 22:28 DEPETERS

DB Error occurred on KFSXGLSC_D1.COLLECT_FILES_01. I reset conditions and re-started it.

Aborted Module Name: OSYSJOBS_04.OSYSPURG_01

Date: Day: Time: Resolution:

03/29/10 Mon 16:30 Restarted by Jan.

Error log and follow up comments:

Rich called and was nice enough to let me know that it is sometimes easier to find an error in the output files if you look for code=. For example, if you open the output file on OSYSJOBS_04.OSYSPURG_01 that is in ABORTED status right now and you look for “code=”, then you will find the real reason the module aborted just above the place where you found “code=”.

See below: (The actual error is: The starting directory is not valid)

find: 0652-010 The starting directory is not valid.

<#/ais02/job/temp/sys_purg_rsh.ksh.959#> errtrap_rsh /ais02/job/temp/sys_purg_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.80#> [[ 1 > 0 ]]

<#errtrap_rsh.80#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1

***

When you report the error, include the directory along with it. It is usually a find or rm command that failed. This job has two output files: it failed on /work/tmp, and then on alm_orautil when I restarted it. The real error is usually a short distance before the “code=” line. It seems no one has pointed this out to you, but I think it is important that ITS is told whether they have the right message so they can learn from it.

Rich.
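
Rich's tip above can be turned into a small helper. This is only a sketch, not an existing ITS script: the function name show_error_context and the default of 10 lines are assumptions made for illustration. It prints each line containing "code=" (in any letter case) together with the lines just before it, which is where the real error usually sits. It uses awk because a before-context option is not available in every grep.

```sh
# Sketch only: show_error_context is a hypothetical helper, not an existing job script.
# Usage: show_error_context OUTPUT_FILE [LINES]
show_error_context() {
    awk -v n="${2:-10}" '
        { buf[NR % n] = $0 }                 # remember the last n lines
        tolower($0) ~ /code=/ {              # matches "code=" and "EXIT CODE="
            for (i = NR - n + 1; i <= NR; i++)
                if (i > 0) print i ": " buf[i % n]
            print "----"
        }
    ' "$1"
}
```

Each match is printed with its line numbers and followed by a "----" separator, so a find or rm failure a few lines above "EXIT CODE=1" is visible without paging through the whole output file.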

Aborted Module Name: VPLUS_LIST_KFSX_REPORT_ACCESS

Date: Day: Time: Resolution:

03/31/10 Wed 06:02 Restarted by ITS.

Error log and follow up comments:

I was checking my emails before getting ready for work and noticed the VPLUS_KFSX_PREPORT_01 job Aborted. Take a look at this error. It seems there is an invalid entry in the vplus_list_report_access.txt file which is causing the problem. Verify the name is correct and it exists.

Corrected report names and restarted module (see above)

echo \n*** REPORT:CSUFR091_enc_AA \n

+ 1>> /ais01/spool/vplus/out/vplus_list_kfsx_report_access.txt

+ vadmin com=gsr rep=RptViewN

params=FORMAT=COMMA,REPORT_NAME=CSUFR091_enc_AA

+ 1>> /ais01/spool/vplus/out/vplus_list_kfsx_report_access.txt

Invalid parameter value: CSUFR091_enc_AA.

+ exit 208

Child: Job return = 208

Rich.

I’m not sure how to proceed.

The user requested a name change for CSUFR091_enc_AA report on March 22nd. The report name is now CSUFR_Encumbrance_AA.

Joleen.

This job failed because a report was renamed in Vista for kfsx and not renamed in the input file which reports on them.

You need to rename the report in file /ais01/spool/vplus/parms/vplus_list_kfsx_report_access_input

When I first built this list I sorted it. If you rename any files in it, you can do a sort command from the top line in ispf.

There might be others you need to fix based on when you renamed in Vista. The job failed where it got the first error, so maybe there are more after it.

Better check them all to be safe if not sure.

Rich.
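
The rename-and-sort step Rich describes could be scripted roughly as follows. This is a sketch under a stated assumption: the input file is taken to hold one report name per line, which the log does not confirm. The function name rename_report is hypothetical.

```sh
# Sketch only: assumes the input file holds one report name per line.
# Usage: rename_report INPUT_FILE OLD_NAME NEW_NAME
rename_report() {
    input=$1; old=$2; new=$3
    tmp="$input.tmp.$$"
    # Swap exact whole-line matches, then re-sort so the list stays ordered.
    awk -v old="$old" -v new="$new" \
        '$0 == old { print new; next } { print }' "$input" |
        LC_ALL=C sort > "$tmp" && mv "$tmp" "$input"
}
```

For the rename in this entry the call would be rename_report /ais01/spool/vplus/parms/vplus_list_kfsx_report_access_input CSUFR091_enc_AA CSUFR_Encumbrance_AA. Only exact whole-line matches are replaced, so other report names are left untouched.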

Joleen, when the KFSX user requests to rename a report in vista plus, ITS will be responsible for modifying the report file /ais01/spool/vplus/parms/vplus_list_kfsx_report_access_input.

Please document this process and also include instructions of taking a screen shot before making any changes that are required.

Steve, thank you for modifying the file and once you’re done, please reset/restart VSTAJOBS_VISTA_RELATED_JOBS……….Debbie.

Aborted Module Name: KFSXTXPM.KFSX_JAVA_01

Date: Day: Time: Resolution:

04/02/10 Fri 10:47 Killed by Jan.

Error log and follow up comments:

2010-04-02 10:47:14,367 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) edu.csu.batch.exceptions.BatchServerException: Batch Server Exception:: :: Job Name - KFSXTXPM.payeeMasterExtractStep.4155985.4155986.00, StepName(s) - [payeeMasterExtractStep]

RunBatch ERROR: Exception found:

edu.csu.batch.exceptions.BatchServerException: Batch Server Exception:: :: Job Name - KFSXTXPM.payeeMasterExtractStep.4155985.4155986.00, StepName(s) - [payeeMasterExtractStep]

at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:76)

at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

rSmart fixed the bug AFTER we did the code merge for 3.0.1. I have pulled their code change into our project. It will need to be tested and applied to prod. So this chain can be killed.

I killed the chain.

Jan.

Aborted Module Name: KFSXPDAP.KFSX_JAVA_02

Date: Day: Time: Resolution:

04/06/10 Tue 08:09 Restarted by ITS.

Error log and follow up comments:

Caused by: java.net.ConnectException: A remote host refused an attempted connect operation.

at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:352)

at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:214)

at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:201)

at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:377)

at java.net.Socket.connect(Socket.java:530)

at java.net.Socket.connect(Socket.java:480)

at com.sun.mail.util.SocketFetcher.createSocket(SocketFetcher.java:232)

at com.sun.mail.util.SocketFetcher.getSocket(SocketFetcher.java:189)

at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:1250)

... 32 more

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

There appears to have been a problem connecting to the mail server. The systems team verified that SMTP looks good on Malta, so can we try re-running the job?

Thanks……………John Walker.

Aborted Module Name: FAIDALCT.SSH_SFTP_DL_01

Date: Day: Time: Resolution:

08/31/12 Fri 08:31 Restarted by Joleen.

11/08/13 Fri 07:04 Restarted by Steve.

06/18/13 Fri 07:04 Restarted by Joleen.

Error log and follow up comments:

08/31/12.

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > Permission denied (password,gssapi-with-mic).

***

I was able to manually connect to the server now. Therefore, as long as the Process Flow notes don't preclude it, you should be able to restart the component.

Elden.

11/08/13.

FAIDALCT.SSH_SFTP_DL_01 / SFTP_FILSEND is stalled on AWPROD with an EMPTY FILE status.

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > Permission denied (password,gssapi-with-mic).

***

I restarted this and it finished…

Steve G.

06/18/13.

# > Write failed: Broken pipe

# > Connection closed

# > (255)

I restarted and the job finished.

Joleen.

Aborted Module Name: HRMSS041.SSH_SFTP_01

Date: Day: Time: Resolution:

04/16/10 Fri 21:38 Restarted by Jan.

Error log and follow up comments:

# - sftp

# COMMAND : /usr/bin/sftp -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_ebmstpa_csu" CSU@ftp.ebms.com

# > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

# > @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @

# > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

# > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

# > Someone could be eavesdropping on you right now (man-in-the-middle attack)!

# > It is also possible that the DSA host key has just been changed.

# > The fingerprint for the DSA key sent by the remote host is

# > 7e:8f:8e:dc:7f:74:f9:7b:0f:d9:13:95:32:12:e3:a4.

# > Please contact your system administrator.

# > Add correct host key in /home/jobprd/.ssh/known_hosts to get rid of this message.

# > Offending key in /home/jobprd/.ssh/known_hosts:10

# > DSA host key for ftp.ebms.com has changed and you have requested strict checking.

# > Host key verification failed.

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

I have asked HR to contact EBMS to let them know the server key has changed.

Diane.
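
The sftp output above names the stale entry ("Offending key in /home/jobprd/.ssh/known_hosts:10"). A minimal sketch for pulling that entry out for review is shown below; the function name show_offending_key is hypothetical. Once the vendor has confirmed the key change, the usual way to drop a host's old entry is ssh-keygen -R with the host name, but that step is deliberately left out of the sketch.

```sh
# Sketch only: show_offending_key is a hypothetical helper.
# Usage: show_offending_key JOB_LOG
show_offending_key() {
    # Pull "FILE LINE" out of each "Offending key in FILE:LINE" message.
    sed -n 's/.*Offending key in \(.*\):\([0-9][0-9]*\)[[:space:]]*$/\1 \2/p' "$1" |
    while read -r file line; do
        echo "== $file, line $line"
        sed -n "${line}p" "$file"      # print only that known_hosts entry
    done
}
```

The helper only reads the files it is given, so it is safe to run before anyone has decided whether the new key should be trusted.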

Aborted Module Name: AREGORAH.AREGS608_01

Date: Day: Time: Resolution:

10/10/12 Wed 13:53 Restarted by Joleen.

Error log and follow up comments:

ERROR at line 1:

ORA-29283: invalid file operation

ORA-06512: at "SYS.UTL_FILE", line 536

ORA-29283: invalid file operation

ORA-06512: at line 234

Rob noticed that the userfile was missing a couple characters. I renamed the file and restarted the job. It completed successfully.

Joleen.

Aborted Module Name: KFSXPDAP.KFSX_JAVA_02

Date: Day: Time: Resolution:

04/27/10 Tue 08:02 Restarted by ITS.

Error log and follow up comments:

+ grep SCRIPT ABORTED /ais02/log/KFSXPDAP.KFSX_JAVA_02.4291790.4291814.00.2010_04_27_0802.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/KFSXPDAP.KFSX_JAVA_02.4291790.4291814.00.2010_04_27_0802.log

+ grep SCRIPT ABORTED

+ cut -f 2 -d =

ssh_return_code=1

+ [[ RunBatch = RunBatch ]]

+ grep KFSXPDAP.KFSX_JAVA_02 /ais01/dat/kfsx/prod/KFSX_PDF_PREFIX_TO_VPLUSRPT

+ chain_vplus_tempdir=/ais01/spool/vplus/temp/KFSXPDAP_tempdir

+ [[ RunBatch = RunBatch ]]

+ [[ -d /ais01/spool/vplus/temp/KFSXPDAP_tempdir ]]

+ print *** \n*** SSH EXECUTED SCRIPT kfsx_java_ssh.ksh EXIT CODE=1 \n*** EXIT WITH EXIT CODE=1 \n***

***

*** SSH EXECUTED SCRIPT kfsx_java_ssh.ksh EXIT CODE=1

*** EXIT WITH EXIT CODE=1

***

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

This appears to be a problem connecting to SMTP/Mail. We’ve had this in the past and only need to restart the job.

RunBatch ERROR: Exception found:

org.springframework.mail.MailSendException; nested exceptions (0) are:

Caused by: javax.mail.MessagingException: Could not connect to SMTP host: smtp.colostate.edu, port: 25;

nested exception is:

java.net.ConnectException: A remote host refused an attempted connect operation.

Thanks……….John Walker.

Aborted Module Name: FAIDEPLS_OD.LYNX_01

Date: Day: Time: Resolution:

11/30/12 Fri 14:06 See follow up below.

Error log and follow up comments:

It seems their website is down!

Whom do we call?

$ ls /appworx/out/LYNX_9435429.00.stdout.txt

/appworx/out/LYNX_9435429.00.stdout.txt

$ more /appworx/out/LYNX_9435429.00.stdout.txt

Gudrun.

I called Candy Chapman. The URL needed to be updated. I made the change and restarted the job. Candy is going to send a list of all the LYNX that have wsprod that need to be changed to wsnet.

Joleen.

From:

http://wsprod.colostate.edu/cwis231/onet/eplus/faidepls_api_rprawrd.aspx?ay={#2}&treq={#3}

To:

http://wsnet.colostate.edu/cwis231/onet/eplus/faidepls_api_rprawrd.aspx?ay={#2}&treq={#3}
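
If the list of affected LYNX URLs arrives as plain text, the same host change could be applied with a filter like the one below. This is a sketch: the function name rewrite_lynx_host is hypothetical, and it assumes one URL per line on standard input.

```sh
# Sketch only: rewrite_lynx_host is a hypothetical filter (stdin to stdout).
rewrite_lynx_host() {
    # Change only the host at the start of the URL; the path and query are untouched.
    sed 's#^http://wsprod\.colostate\.edu/#http://wsnet.colostate.edu/#'
}
```

Lines that do not start with the wsprod host pass through unchanged, so the output can be compared against the input to see exactly which modules need updating.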

Aborted Module Name: AROSFRQ1.AROS-PYMTS-LOOP-01

Date: Day: Time: Resolution:

04/30/10 Fri 08:35 Restarted by Janice.

Error log and follow up comments:

Error:-999 -ERR 171 Database error on 'AW_REQUEST' - ORA-20006: User "APPWORX" is not active - contact I.T. Scheduling at 491-1375

ORA-06512: at "APPWORX.AW5", line 110

ORA-06512: at "APPWORX.AW5", line 54

ORA-06512: at line 1

Janice.

Aborted Module Name: FAIDTMIM.TDCLIENT_01

Date: Day: Time: Resolution:

05/06/10 Thu 08:30 Restarted by David.

11/18/13 Mon 08:31 Restarted by David.

Error log and follow up comments:

05/06/10.

+ 1>> FAIDTMIM_receive_cmdfile

+ read this_receive_tdclient

+ tdclient_out=/ais01/dat/work/prod/FAIDTMIM.TDCLIENT_01.non_isir

+ tdclientc cmdfile=FAIDTMIM_receive_cmdfile

+ 1> /ais01/dat/work/prod/FAIDTMIM.TDCLIENT_01.non_isir 2>& 1

+ exit 21

+ err=21

+ [ 21 -eq 0 ]

+ [ 21 != 0 ]

+ status=ABORTD

Tom fixed the value in the #FAID_TDCLIENT_CUR_PASSWORD subvar. However, for the aborted component in backlog, changing the value in the subvar has no impact because the FAIDSPWD.TDCLIENT_01 prompt #3 value had already been populated with the old value from the #FAID_TDCLIENT_CUR_PASSWORD subvar when this component was originally submitted to run. Resolution of this problem could have been approached either of the following ways:

1) Delete the chain and request it to run again

2) Modify the FAIDSPWD.TDCLIENT_01 prompt #3 (current password) in backlog to match the value which Tom had updated to #FAID_TDCLIENT_CUR_PASSWORD subvar and restart the failed component.

David and I chose option #2 to resolve the abort. However, if this situation should occur in the future, it might be easier for IT Scheduling to choose option #1. By the way, this was an unusual situation – the #FAID_TDCLIENT_CUR_PASSWORD value is normally not changed by Tom prior to requesting FAIDSPWD. However, the SAIG userid had been suspended and the password was reset manually, thereby requiring an update to #FAID_TDCLIENT_CUR_PASSWORD which should have been done prior to requesting FAIDSPWD_TDCLIENT_CHG_PASSWORD.

Janice.
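
The behavior Janice describes, where a prompt keeps the value it was given at submission even after the subvar is corrected, can be illustrated with plain shell variables. This is an analogy only and not scheduler code: cur_password stands in for the #FAID_TDCLIENT_CUR_PASSWORD subvar and prompt3 for prompt #3 of the component in backlog.

```sh
# Illustration only: plain shell variables stand in for the subvar and the prompt.
cur_password='old-value'            # stands in for #FAID_TDCLIENT_CUR_PASSWORD

submit_component() {
    prompt3=$cur_password           # prompt #3 is filled in when the component is submitted
}

submit_component                    # component enters backlog holding the old value
cur_password='new-value'            # subvar corrected afterwards

echo "subvar:  $cur_password"
echo "prompt3: $prompt3"
```

The prompt still holds the old value after the subvar changes, which is why the options are either to resubmit the chain (option 1) or to edit the prompt in backlog (option 2).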

11/18/13.

# 20131118-083134 : pipe_exec | cmdout = <Bytes per second: sent 129.7, received 255.3

# 20131118-083134 : pipe_exec | cmdout = <debug1: Exit status 107

# 20131118-083134 : *** FATAL ***main::check_status | SEND_TO_CMD (close) [0] failed to execute (107)

# 20131118-083134 : *** FATAL ***main::check_status | (100)

# 20131118-083134 : *** FATAL ***main::check_status | #****************************************************************************************************

# 20131118-083134 : *** FATAL ***main::check_status | CMDOUT (close) [37880030] failed to execute (100)

# 20131118-083134 : *** FATAL ***main::check_status | (100)

# 20131118-083134 : *** FATAL ***main::check_status |

Looks like we had a connection problem. I re-started and it finished.

David.

Aborted Module Name: FAIDSPWD.TDCLIENT_01

Date: Day: Time: Resolution:

05/06/10 Thu 08:37 Restarted by David.

Error log and follow up comments:

Tom called to say he fixed the problem. Can I just re-set the module?

1) Delete the chain and request it to run again

2) Modify the FAIDSPWD.TDCLIENT_01 prompt #3 (current password) in backlog to match the value which Tom had updated to #FAID_TDCLIENT_CUR_PASSWORD subvar and restart the failed component.

Janice.

Aborted Module Name: RAMCSYNC.RAMCS001_FA

Date: Day: Time: Resolution:

05/11/10 Tue 08:50 See follow up by David below.

Error log and follow up comments:

08:50:06 SQL> /

old 4: currterm varchar2(6) := '&&report_term';

new 4: currterm varchar2(6) := '201090';

currterm varchar2(6) := '201090';

ERROR at line 4:

ORA-04052: error occurred when looking up remote object

WEBCT.PERSON@ELPROD_WEBCT

ORA-00604: error occurred at recursive SQL level 1

ORA-12519: TNS:no appropriate service handler found

08:50:06 3 --* GET INPUT PARAMETER FOR TERM

08:50:06 4 currterm varchar2(6) := '&&report_term';

08:50:06 5

08:50:06 6

I talked with Kelly and discovered that there is currently a database issue with ELPROD. The database will need to be re-started to resolve the problem, but this may not happen for a while due to Finals currently underway. RAMCSYNC will fail until this issue is resolved.

David.

Aborted Module Name: PAGER – ON CALL DELAYS

Date: Day: Time: Resolution:

05/06/10 Thu 19:15 See follow up below.

Error log and follow up comments:

I was paged because KFSXAW03_START_KFSX_SCHEDULE had not started yet. It seemed to be waiting on KFSXSYFY_SYS_FISCAL_YR_MAKER which had an aborted module in it (KFSXSYFY.KFSX_JAVA_01). I tried to check into why the module aborted and could not figure it out. So, I called Janice and she said she would take care of it.

Dawn.

KFSXSYFY.KFSX_JAVA_01 failed with the following error:

2010-05-06 19:15:18,043 [main] INFO

edu.csu.batch.service.RunBatch :: Finished executing job:

KFSXSYFY.fiscalYearMakerStep.4351849.4351850.00 steps:

[fiscalYearMakerStep]

2010-05-06 19:15:18,043 [main] INFO

edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception

(nested)

org.springframework.dao.DataIntegrityViolationException:

OJB operation; SQL []; ORA-02292: integrity constraint

(KFSUSER.GL_ENTRY_TR13) violated - child record found ; nested exception is java.sql.SQLException: ORA-02292:

integrity constraint (KFSUSER.GL_ENTRY_TR13) violated - child record found

We've already run the "big" run of the KFSXSYFY chain back in April, so this would just be a run to add new entries since that run.

I think it can wait until in the A.M. to solve, so I deleted the failed component to allow the remainder of the schedule to complete.

Janice.

Fiscal Year Maker failed because there are entries in the GL table (4) that have a reversal date beyond 30-jun-2010.

RunBatch ERROR: Exception found:

org.springframework.dao.DataIntegrityViolationException: OJB operation; SQL []; ORA-02292: integrity constraint (KFSUSER.GL_ENTRY_TR13) violated - child record found ; nested exception is java.sql.SQLException: ORA-02292: integrity constraint (KFSUSER.GL_ENTRY_TR13) violated - child record found

Caused by: java.sql.SQLException: ORA-02292: integrity constraint (KFSUSER.GL_ENTRY_TR13) violated - child record found

Kevin.

Aborted Module Name: KFSXFPPC.KFSX_JAVA_03

Date: Day: Time: Resolution:

05/11/10 Tue 19:27 Restarted by Jan.

Caused by:

javax.mail.SendFailedException: Invalid Addresses;

nested exception is:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Rachael.Crnich@ColoState.EDU>... User unknown

at com.sun.mail.smtp.SMTPTransport.rcptTo(SMTPTransport.java:1196)

at com.sun.mail.smtp.SMTPTransport.sendMessage(SMTPTransport.java:584)

at javax.mail.Transport.send0(Transport.java:169)

at javax.mail.Transport.send(Transport.java:98)

at org.kuali.rice.kew.mail.Mailer.sendMessage(Mailer.java:150)

at org.kuali.rice.kew.mail.Mailer.sendMessage(Mailer.java:170)

at org.kuali.rice.kew.mail.service.impl.DefaultEmailService.sendEmail(DefaultEmailService.java:66)

... 176 more

Caused by:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Rachael.Crnich@ColoState.EDU>... User unknown

at com.sun.mail.smtp.SMTPTransport.rcptTo(SMTPTransport.java:1047)

Matt,

Can you look into this? I suspect it is a bad email address. I thought the program had been changed to trap that error………………..Kevin.

Aborted Module Name: AGENWYWP.AGENS004_01

Date: Day: Time: Resolution:

05/25/10 Tue 23:02 Deleted by Janice.

06/02/10 Wed 19:14 Deleted by Janice.

Error log and follow up comments:

05/25/10.

old 8: outpath varchar2(255) := '&&utl_path';

new 8: outpath varchar2(255) := '/orautl/BANPROD';

old 9: not_purgeable_file varchar2(255) := '&&utl_file1';

new 9: not_purgeable_file varchar2(255) := 'AGENWYWP.AGENS004_01.utl_file1';

old 10: error_file varchar2(255) := '&&utl_file2';

new 10: error_file varchar2(255) := 'AGENWYWP.AGENS004_01.utl_file2';

old 11: purgeable_file varchar2(255) := '&&utl_file3';

new 11: purgeable_file varchar2(255) := 'AGENWYWP.AGENS004_01.utl_file3';

**** Start of AGENS004 05/25/2010 18:00:46

ERROR at line 1:

ORA-00001: unique constraint (GENERAL.GOBSRID_KEY_INDEX) violated

ORA-06512: at "BANINST1.ICGOKCOM", line 675

ORA-06512: at "BANINST1.ICSPKLDI", line 468

ORA-06512: at "BANINST1.ICSPKLDI", line 561

ORA-06512: at "SATURN.ST_SPBPERS_AS_LDI", line 5

ORA-04088: error during execution of trigger 'SATURN.ST_SPBPERS_AS_LDI'

ORA-06512: at line 602

I do not see why this program aborted. I am working with Joe Rymski to reduce the number of people being purged at one time.

So this will run again this evening (Joe said he has worked with IT Scheduling to have this chain run nightly for the next week).

So this abort has been “resolved” for today and we will try again tonight.

Vicki.

06/02/10.

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character to number conversion error

ORA-06512: at line 602

18:00:46 SQL> DECLARE

18:00:46 2 -- Constants

18:00:46 602 raise;

Just an FYI and a helpful hint:

I needed to see the UTL file output in order to determine what the problem was. The information in the error log alone was not sufficient………………..Vicki.

The utl files for Banner SQL scripts live in this directory (if the script has created a utl file):

/orautl/BANPROD……………………Jan.
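A minimal sketch of locating those files for one chain component, assuming the naming pattern seen in the 05/25/10 log above (`<chain>.<module>.utl_fileN` under /orautl/BANPROD). The function names are hypothetical and are not part of any existing ITS tooling.

```python
from pathlib import Path


def find_utl_files(module_name, utl_dir="/orautl/BANPROD"):
    """Return the utl files written by one chain component, sorted by name.

    Assumes files are named '<module_name>.utl_file<N>', as in the
    AGENWYWP.AGENS004_01 log. This helper is hypothetical.
    """
    directory = Path(utl_dir)
    if not directory.is_dir():
        return []
    return sorted(directory.glob(f"{module_name}.utl_file*"))


def tail(path, lines=5):
    """Return the last few lines of a utl file as a list of strings."""
    text = Path(path).read_text(errors="replace")
    return text.splitlines()[-lines:]
```

Calling `find_utl_files("AGENWYWP.AGENS004_01")` and then `tail()` on each result would give a quick look at where each output file stopped.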

Aborted Module Name: HRMSSERP_BI.HRMSS221_01

Date: Day: Time: Resolution:

08/13/10 Sat 03:42 Restarted by Janice.

03/22/14 Sat 04:31 Restarted by Robin.

Error log and follow up comments:

08/13/10.

ERROR at line 680:

ORA-06550: line 680, column 10:

PLS-00103: Encountered the symbol "ELSE" when expecting one of the following:

* & = - + ; < / > at in is mod remainder not rem <an exponent (**)> <> or != or ~= >= <= <> and or like like2

like4 likec between || multiset member submultiset The symbol ";" was substituted for "ELSE" to continue.

ORA-06550: line 682, column 10:

PLS-00103: Encountered the symbol "END" when expecting one of the following:

* & = - + ; < / > at in is mod remainder not rem <an exponent (**)> <> or != or ~= >= <= <> and or like like2

like4 likec between || multiset member

FYI - Since this failure was within the "update" portion of the HRMS schedule, the entire "non-update" portion of the HRMS schedule, plus the WHRS schedule, plus the EIDS schedule, plus ODSRHRMS/ODSREIDS refreshes from Friday night and Sunday night are held up waiting for resolution of this HRMS failure.

Resolution of this failure needs to occur this morning ASAP to allow all these waiting schedules to proceed!

What testing was performed for the recent changes to HRMSS221? Program was changed 8/4 and last production run was 7/30.

Janice.

I found the problem. I have changed the code and will now get it rolled to production asap.

-Bob-

03/22/14.

Total records for outfile1 = 99

Total records for outfile2 = 0

Total records for outfile3 = 1773

Total records for outfile4 = 11430

Failed: others

-30036 ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDO_SPACE'

ERROR at line 1:

ORA-20000: Failed: -30036 ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDO_SPACE'Possible error getting pay advice date

ORA-06512: at line 1196

The job finished successfully.

Jeff, I had to kill your work number concurrent program. The first one finished successfully and it had moved on to the second one which was killed.

Steve. H.

Aborted Module Name: AGENWYWP.AGENS004_01

Date: Day: Time: Resolution:

10/08/10 Fri 18:00 Aborted chain deleted by ITS as per Janice.

Error log and follow up comments:

ERROR at line 1:

ORA-01403: no data found

ORA-06512: at line 602

From the output utl_file, /orautl/BANPROD/AGENWYWP.AGENS004_01.utl_file1, we have the following additional information:

-29337790 -Purge A Shanta, Hanan Ali SPRIDEN record is newly created: -29337790

Program ended unexpectedly at CSU ID -29337790

SQLCODE/SQLERRM = 100, ORA-01403: no data found

Persons Not Purged: 1487

And from /orautl/BANPROD/AGENWYWP.AGENS004_01.utl_file3:

829285367 Kim, Lynn N

Program ended unexpectedly at CSU ID -29337790

SQLCODE/SQLERRM = 100, ORA-01403: no data found

Persons Purged: 14144

Janice.

Since a portion of this chain completed, please just delete the failed AGENS004_01 component and allow the remainder of the chain to complete.

Janice.

Aborted Module Name: AREGDYTR.CONVERT_PDFTOPS_01

Date: Day: Time: Resolution:

06/22/10 Tue 07:10 Restarted by Jan.

Error log and follow up comments:

# ERROR: File not found (/ais01/spool/out/AREGDYTR.AREGR600.4607666.PDF

# Exiting /appworx/csu/exec/CONVERT_PDF_TO_PS.KSH with Return Code (100)

error is 100

There is a problem with the report server. Josh was going to ask Mark Britton to bounce it. I’m not sure if this is related to this issue or not.

Vicki.

The report server is back up and AREGDYTR is complete.

Jan.

Aborted Module Name: KFSXBCGB.KFSXS034_01

Date: Day: Time: Resolution:

06/22/10 Tue 22:01 Restarted by ITS.

Error log and follow up comments:

22:01:04 674 --End Main Program Logic

22:01:04 675 END MAIN_LOGIC;

22:01:04 676 /

old 190: vfiscal_year ld_csf_tracker_t.univ_fiscal_yr%type := '&&univ_fiscal_year';

new 190: vfiscal_year ld_csf_tracker_t.univ_fiscal_yr%type := '2010';

22:01:04 45 select distinct

22:01:04 46 --p.position_nbr position_nbr

22:01:04 47 substr(p.name, 1, 6) position_nbr

22:01:04 48 ,nvl(a.effective_start_date,

22:01:04 49 p.effective_start_date) effdt

22:01:04 50 ,null obj_id

22:01:04 51 ,1 ver_nbr

22:01:04 52 ,substr(j.name,1, 6) jobcode

22:01:04 53 ,'A' pos_eff_status

22:01:04 54 ,substr(j.name, 1, 30) descr

22:01:04 55 ,substr(j.name, 1, 10) descrshort

22:01:04 56 ,'----' business_unit

22:01:04 57 ,nvl((select 'CO-' || o.attribute1

22:01:04 58 from hr_all_organization_units@hrprod o

select distinct

ERROR at line 45:

ORA-06550: line 45, column 1:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 45, column 1:

PL/SQL: SQL Statement ignored

ORA-06550: line 85, column 2:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 85, column 2:

PL/SQL: SQL Statement ignored

ORA-06550: line 140, column 1:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 140, column 1:

PL/SQL: SQL Statement ignored

I noticed that this sql has a link to hrprod – should it have any dependencies on HRMS chain(s)?

Janice.

Aborted Module Name: ODSRFAMS.ODSRS002_01

Date: Day: Time: Resolution:

06/22/10 Tue 23:22 See note below.

Error log and follow up comments:

/************************************************/

/* Running LOAD_FAMIS_DEPT_SPACE_FUNC */

/* Started Tue Jun 22 2010, 23:22:16 */

/************************************************/

Stage 1: Decoding Parameters

| location_name=CSUFAMIS_LOCATION

| task_type=PLSQL

| task_name=LOAD_FAMIS_DEPT_SPACE_FUNC

Stage 2: Opening Task

declare

ERROR at line 1:

ORA-20001: Task not found - Please check the Task Type, Name and Location are

correct.

ORA-06512: at "OWBREP.WB_RT_API_EXEC", line 704

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 41

ORA-06512: at line 216

23:22:16 215 WHEN 'REFRESH_FAMIS' THEN

23:22:16 216 csug_run_owb_task('OWBREP', 'CSUFAMIS_LOCATION', 'PLSQL', 'LOAD_FAMIS_DEPT_SPACE_FUNC');

23:22:16 217 csug_run_owb_task('OWBREP', 'CSUFAMIS_LOCATION', 'PLSQL', 'LOAD_FAMIS_EMP_SPACE_DEPT');

23:22:16 218 ELSE

David.

Mike,

You will have to double check your syntax. The "Task not found" error indicates you have something wrong with the location name or load name.

Mark A. Paquette.

Aborted Module Name: AREGCNTB.ODSRS100_01

Date: Day: Time: Resolution:

06/26/10 Sat 02:53 Restarted by Joleen.

Error log and follow up comments:

02:53:52 258

02:53:52 259 --*--------------------------------------------------------------------*

02:53:52 260 --************ ADD Records to CUR table from view course_schedule *****

02:53:52 261 --*--------------------------------------------------------------------*

02:53:52 262 begin <<add_cur3>>

02:53:52 263 insert into csus_applicant_cen_cur

02:53:52 264 (select * from csus_applicant

02:53:52 265 where ltrim(rtrim(term)) = csus_f_cur_term_ods);

02:53:52 266 end add_cur3;

02:53:52 267 v_add3_count := SQL%ROWCOUNT;

02:53:52 268

02:53:52 269 end del_applicant;

02:53:52 270

02:53:52 577 --* / -- THIS EXECUTES THE PL/SQL BLOCK STORED IN THE BUFFER

02:53:52 578 --*--------------------------------------------------------------------*

02:53:52 579 .

02:53:52 SQL> /

**** Start of ODSRS100 06/26/2010 02:53:52

begin <<main_block>>

ERROR at line 1:

ORA-00001: unique constraint (CSUBAN.CSUS_APPLICANT_CEN_CUR_IX_01) violated

ORA-06512: at line 263

Jan.

Bev,

Can you please take a look at why we were not able to create the Census Applicant table Friday night?

I’ll take a look as well.

Vicki.

We know what is causing the problem, but are waiting to hear back from an end user (Jordan Fritts) to confirm and to fix the data.

We will wait until this afternoon. If we don’t hear back from Jordan by then, we have a plan for how to proceed.

Vicki.

Aborted Module Name: HRMSCHK_QPH.CHECK_WRITER_01

Date: Day: Time: Resolution:

12/28/10 Tue 08:07 See follow up below.

Error log and follow up comments:

We were not able to locate an error.

I’m not quite sure what happened with this one. It appears that a DB Error occurred while trying to evaluate AFTER conditions. Since no checks were produced and the #HRMSCHK_QPH_CHECKS_CH subvar contained the correct “NO_CHECKS” value, I just deleted the HRMSCHK_QPH.CHECK_WRITER_01 component to allow the chain to proceed. Since #HRMSCHK_QPH_CHECKS_CH=NO_CHECKS, the subsequent HRMSCHK_QPH.HRMSS201_01 and HRMSCHK_QPH.HRMSR218_01 components were then properly skipped.

Janice.

Aborted Module Name: KFSXCS52.KFSXS007_01

Date: Day: Time: Resolution:

06/30/10 Wed 23:32 Restarted by Jan.

Error log and follow up comments:

23:32:20 315

23:32:20 316 Select count(*) into ws_grp_count

23:32:20 317 from krim_grp_mbr_t

23:32:20 318 where MBR_ID = X.KFS_PRNCPL_ID

23:32:20 319 and MBR_TYP_CD = 'P'

23:32:20 320 and trunc(sysdate) between nvl(ACTV_FRM_DT,trunc(sysdate)) and nvl(ACTV_TO_DT,trunc(sysdate));

23:32:20 321 ws_grp_names := Null;

23:32:20 322 For xx in empl_grp_cursor (X.KFS_PRNCPL_ID) Loop

23:32:20 323 ws_grp_names := ltrim(ws_grp_names||' '||xx.grp_nm);

23:32:20 324 End Loop;

Terminated Employee: Selzer,Dan R 6196 dselzer INACTIVE CSU Ex-Employee

* cardholder=1

Terminated Employee: Shah,Alisha 46966 *25156* INACTIVE Employee

Terminated Employee: Shetter,David Owen 41910 *17876* INACTIVE CSU Ex-Employee

declare

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 323

Jan.

I increased the field size from 500 to 1500. Script is in temp. Hard to believe someone has that many groups.

Kevin.

Aborted Module Name: KFSXSYCC.KFSX_JAVA_01

Date: Day: Time: Resolution:

07/02/10 Fri 16:00 Deleted by Dawn.

Error log and follow up comments:

+ grep Loading DD /ais02/log/KFSXSYCC.KFSX_JAVA_01.4670272.4670274.00.2010_07_02_1600.log

+ 1> /dev/null

+ sed -n /errtrap_ssh/,$ p /ais02/log/KFSXSYCC.KFSX_JAVA_01.4670272.4670274.00.2010_07_02_1600.log

<#/ais02/job/temp/kfsx_java_ssh.ksh.80#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.23#> [[ 1 > 0 ]]

<#errtrap_ssh.23#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1

We are trying to rebuild production KFS (so it is down). When rebuilding KFS, we cannot run AppWorx jobs (because the Java libraries are missing). Clear Cache is scheduled every 2 hours, so this failed.

This does not need to run. Can someone just cancel this job?

Thanks………………John W.

Aborted Module Name: KFSXBCUD.KFSXS037_01

Date: Day: Time: Resolution:

07/06/10 Tue 06:02 Restarted by Jan.

Error log and follow up comments:

ORA-04052: error occurred when looking up remote object

ORA-00604: error occurred at recursive SQL level 1

ORA-12526: TNS:listener: all appropriate instances are in restricted mode

Josh,

Is this script okay to restart?

Jan.

Yes this script can be restarted.

Josh.

Aborted Module Name: WHRSL011.SQLLOAD-LOOP_01

Date: Day: Time: Resolution:

07/09/10 Fri 22:40 Restarted by Jan.

Error log and follow up comments:

+ print Failure in spawned loader - abort this module

Failure in spawned loader - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

07/09/2010 23:02 JMWILKIN

Just checking on the WHRSL011 (last prv fy loader job) - it failed in the SQLLOAD - looks like the whrs_prv_fy_exphist_00 table definition on ODSPROD does not match the whrs_cur_fy_exphist_00 - missing the SUBACCT column on the whrs_prv_fy_exphist_00. I'm guessing this was a new column added to the cur_fy and the like change was overlooked for the prv_fy version of the table? DBA will need to add column, so guess this will have to hang until Monday.

SQL*Loader-466: Column SUBACCT does not exist in table "CSUHR"."WHRS_PRV_FY_EXPHIST_00".

WHRSL011.SQLLOAD-LOOP_01 in LOADFAIL status is inhibiting the completion of WHRSAWY1_FYEND_ROLLOVER_TASKS.

Joleen.
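The mismatch described above (a column present on the cur_fy table but missing from the prv_fy table) could be caught before a load by comparing the two column lists. The sketch below is illustrative only: apart from SUBACCT, the column names are made-up sample data, and in practice the lists would come from the database's data dictionary.

```python
def missing_columns(reference_cols, target_cols):
    """Columns present in the reference table but absent from the target.

    Comparison is case-insensitive; order follows the reference table.
    """
    target = {c.upper() for c in target_cols}
    return [c for c in reference_cols if c.upper() not in target]


# Sample data only: apart from SUBACCT, these column names are
# hypothetical placeholders.
cur_fy_cols = ["EMPLOYEE_ID", "FISCAL_YEAR", "SUBACCT"]
prv_fy_cols = ["EMPLOYEE_ID", "FISCAL_YEAR"]

print(missing_columns(cur_fy_cols, prv_fy_cols))  # ['SUBACCT']
```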

WHRSL001 is complete with errors:

Record 8257: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8258: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8259: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8260: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8261: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8262: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Table "CSUHR"."WHRS_PRV_FY_EXPHIST_00":

2178 Rows successfully loaded.

10000 Rows not loaded due to data errors.

0 Rows not loaded because all WHEN clauses were failed.

0 Rows not loaded because all fields were null.

Space allocated for bind array: 255420 bytes(258 rows)

Read buffer bytes: 1048576

Total logical records skipped: 0

Total logical records read: 12222

Total logical records rejected: 10000

Total logical records discarded: 0

Jan.

Aborted Module Name: HRMSCPR_HRL.HRMSS064_01

Date: Day: Time: Resolution:

03/18/11 Fri 14:18 Restarted by ITS.

Error log and follow up comments:

Amount Not Distributed: Virgin,Joanna 1450 6467700. 50.00

Frozen Acount: 193563 used 1313850

Frozen Acount: 202887 used 1301320

Frozen Acount: 256481 used 1354280

Frozen Acount: 236808 used 1354280

Frozen Acount: 246103 used 1354280

ERROR at line 1:

ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

ORA-06512: at line 961

The problem is that the program is trying to distribute the $50.00 amount for Virgin,Joanna into the 6467700 account. This account does not exist in the gl_code_combinations table.

Who is the appropriate person to decide if the incorrect account is being used or who is responsible for inserting this new account?

-Bob-

Vickie Schultz added the account to the student's labor schedule (which put it in the gl_code_combinations table). You should be able to rerun this chain now.

Kevin.

Aborted Module Name: HRMSENCD.HRMSS079_01

Date: Day: Time: Resolution:

07/27/10 Tue 18:00 See follow up below.

04/26/11 Tue 18:00 See follow up below.

05/29/13 Wed 18:01 See follow up below.

Error log and follow up comments:

07/27/2010 19:45 MUELLER

Steve called about the HRMSS079 failure. I talked to Diane, Steve and Craig. Craig found some hung sessions and killed them allowing HRMSS079 to complete and the HRMSENCD chain to continue.

07/27/2010 19:45 CPERRY

I got a call from Jan about the Encumbrance process erroring out trying to rebuild an index. The message was about 'Could not acquire resource'. I looked on kebler and there were a couple of hung forms sessions. I killed these and the process ran. They must have had a hold on the indexes that were trying to be rebuilt by the Encumbrance job.

04/26/2011 18:23 JMWILKIN

I noticed that a page went out for critical job failure in HRMSENCD chain. It's the error we sometimes see in HRMSS079 with resource busy. I tried to resubmit HRMSS079, but it failed again with same message. Might wait a bit and try again - if still no luck, then probably should give DBA a call to just check on the HR database?

Robin just called about the HRMSENCD since she had received the page. As we chatted about the course of action, I decided to try to restart one more time before we contacted anyone else... and it finished successfully.

05/29/2013 20:51 DMCINTOS

HRMSENCD.HRMSS079_01 is in critfail status. I looked into it. Had to contact the oncall DBA. Craig is looking into it. Craig informed me that he found a forms session that had locks on some of the PSP tables. He killed it. He asked me to restart the job and I did. It is running now.

Aborted Module Name: HRMSBURS_EM.SEND_MAIL_01

Date: Day: Time: Resolution:

10/29/10 Fri 13:20 See note from Janice below.

Error log and follow up comments:

SEND_MAIL components failed with the following error:

SMTP Failed to connect to mail server: A system call received a parameter that is not valid.

at /appworx/csu/exec/SENDMAIL.PL line 774

error is 255

I talked with Elden about this error and we decided to just retry the jobs. They completed successfully so apparently whatever caused the mail server problem has been corrected. Elden may pursue some follow-up with ACNS regarding the mail server.

Janice.

Aborted Module Name: DOITKFSX_RG.DOIT_GET_FILE_01

Date: Day: Time: Resolution:

01/17/13 Thu 10:54 Restarted by Steve.

Error log and follow up comments:

+ print *** \n*** DOIT_GET_FILE FAILED TO FIND DOIT FILES WITHIN LOOP COUNT MAX TRIES \n***

*** DOIT_GET_FILE FAILED TO FIND DOIT FILES WITHIN LOOP COUNT MAX TRIES

+ exit 100

I found references to this same error in our abort log, and in each case the job was restarted. I restarted this one and it seems to be finding files again now.

Steve. G.

Aborted Module Name: AROSDPA3_PAYMENT_APPLICATION_3

Date: Day: Time: Resolution:

08/16/10 Mon 11:31 Restarted by ITS.

Error log and follow up comments:

RUN SEQUENCE NUMBER:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "ODSMGR.ROKODST", line 362

ORA-00001: unique constraint (FAISMGR.RORNCHG_INDEX_01) violated

ORA-06512: at "ODSMGR.ROKODST", line 16

ORA-06512: at line 1

ORA-06512: at "ODSMGR.GOKODST", line 69

ORA-06512: at "TAISMGR.TT_TBRACCD_INSERT_ODS_CHANGE", line 24

ORA-04088: error during execution of trigger 'TAISMGR.TT_TBRACCD_INSERT_ODS_CHANGE'

ORA-06512: at "BANINST1.DML_TBRACCD", line 68

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 1685

ORA-06

WRN-ORACERR: Error occurred in file "tgrappl.pc" at line 3,028

WRN-ERRSTMT: Following statement was last statement parsed:

begin tb_receivable . p_update ( p_PIDM => :ap_request_pidm , p_TRAN_N

tgrappl terminated with error

0 lines written to /appworx/out/AROSDPA3.TGRAPPL_01.4899213.4899217.00.2041209.lis

Jacque/Joe,

Can we restart Application of Payments 3 due to the deadlock error?

Josh.

It is OK to restart………… Jacque Clark.

I got this message when I tried to re-start. Please advise.

Starting TGRAPPL (Release 8.1.1.1)

* **WARNING** *

* You cannot submit this job - it is already running. *

* *

* You will also get this message if a previous run of *

* this program aborted. If this is the case, the *

* control record for that run must be deleted before *

* proceeding. (GJBPRUN record for this jobname with *

* a -1 one-up-no).

David.

The GJBPRUN table contains this record:

TGRAPPL

-1 01 16-AUG-10

Ctrl rec shows job-in-progress

I think we may need to delete this record in order to restart the job:

delete from gjbprun where gjbprun_one_up_no = '-1' and gjbprun_job = 'TGRAPPL'

Please advise……………Janice.

I remember seeing this error before. When a job aborts, it sometimes leaves a record in the gjbprun table with a one_up_number = -1. In the past we have deleted this record from gjbprun and the process could then be re-run.

Mike Giebler.

OK, I see the -1 number out there.

I will verify that a version is not running, have the record removed, and restart the job…………….Josh.
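The look-then-delete sequence discussed above can be sketched as follows. An in-memory SQLite table is used purely as a stand-in for GJBPRUN so the example is runnable; it has only the two columns named in Janice's delete statement, and the function name is hypothetical. This is not the Banner procedure itself, and confirming that no copy of the job is actually running remains a manual step.

```python
import sqlite3


def delete_stale_control_record(conn, job_name):
    """Delete the -1 control record(s) for a job and return how many were removed.

    The matching rows are read first so nothing is deleted blindly.
    """
    where = "gjbprun_one_up_no = '-1' AND gjbprun_job = ?"
    rows = conn.execute(
        f"SELECT gjbprun_job, gjbprun_one_up_no FROM gjbprun WHERE {where}",
        (job_name,),
    ).fetchall()
    if not rows:
        return 0  # no stale control record for this job
    cursor = conn.execute(f"DELETE FROM gjbprun WHERE {where}", (job_name,))
    conn.commit()
    return cursor.rowcount


# Stand-in table for demonstration only (not the real GJBPRUN definition).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gjbprun (gjbprun_job TEXT, gjbprun_one_up_no TEXT)")
conn.execute("INSERT INTO gjbprun VALUES ('TGRAPPL', '-1')")
print(delete_stale_control_record(conn, "TGRAPPL"))  # 1
```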

Aborted Module Name: WHRSL023.HRMSS020_01

Date: Day: Time: Resolution:

08/17/10 Tue 22:05 Restarted by Janice.

Error log and follow up comments:

08/17/2010 22:26 JLHUTCH

I WAS PAGED AT 10:07 & 10:37 PM ABOUT A DB ERROR ON WHRSL023.HRMSS020_01. CONDITIONS WERE PRESENT WITH TIMING OF "BEFORE" AND PERFORMED OF "DONE".

I CALLED JANICE AND SHE SAID SHE WOULD TAKE A LOOK AT IT.

======================================================================

08/17/2010 22:50 JMWILKIN

The DBERROR was one of those weird situations that we see occasionally where it appears that the so_log column of so_job_queue table for the jobid fills up. I updated the so_log column with a shorter entry - doublechecked the conditions which had already completed and then restarted the failed component.

Aborted Module Name: HRMSREC_SAL.HRMSS226_01

Date: Day: Time: Resolution:

09/20/10 Mon 10:24 Restarted by Janice.

Error log and follow up comments:

Mon Sep 20 10:24:20 :205 -ERR 171 Database error on 'SUBMIT_OAE_JOB' - ORA-20096: Cannot submit concurrent request for program HRMSS226

Check if the concurrent program is registered with Application Object Library.

Check if you specified the correct application short name for y

Mon Sep 20 10:24:20 :Error SUBMIT_OAE_JOB failed.

Mon Sep 20 10:24:21 MDT 2010

Contents of /appworx/out/o7145818:

Mon Sep 20 10:24:20 :Starting OAE Processing on jobid 5076812

Mon Sep 20 10:24:23 MDT 2010 Page 1

Concurrent Program Parameter(s)

Parameter Value

----------------------- ----------------------------------------

Responsibility CSU Human Resources Payroll

Program App. Short Name CSUH

Job to Run HRMSS226

salary_end_date 2010/09/30 00:00:00

email_listserv hrsao_campus_rec_dedn@mail.colostate.edu

utl_path

utl_file1

The new HRMSS226 program was not registered on hrprod (this is first time to run in production). Bob took care of that and I restarted the aborted HRMSREC_SAL.HRMSS226_01 chain component, which has now successfully completed.

Janice.

Aborted Module Name: AREGDYTR.SQLLOAD-LOOP_01

Date: Day: Time: Resolution:

08/26/10 Thu 07:16 See follow up below.

Error log and follow up comments:

+ egrep ABORTED|CRITFAIL

+ grep 4961448

4961448.00 BATCH AREGDYTR.SQLLOAD_01 08/26 07:16 00:00:01 ABORTED AREGDYTR_DAILY_TRANSCRIPT

+ print Failure in spawned loader - abort this module

Failure in spawned loader - abort this module

With looper scripts, such as SQLLOAD-LOOP, determination of the problem requires research into what caused the “spawned loader” to fail. The error message from AREGDYTR.SQLLOAD_01:

Record 2: Rejected - Error on table CSUBAN.SWLTNSC, column STREET1.

ORA-12899: value too large for column "CSUBAN"."SWLTNSC"."STREET1" (actual: 63, maximum: 40)

Table CSUBAN.SWLTNSC:

55 Rows successfully loaded.

1 Row not loaded due to data errors.

The data within “street1” of the bad record was not longer than 40 characters (actually it was only 33 characters):

OFFICE OF THE EDUCATIONAL ATTACH*

However, the last character, although displaying as *, is not really an asterisk character. The hex value for this character is C9, whereas a true * character has hex value of 2A. I simply edited the data file, /ais01/ftp/from/user/AREGDYTR.AREG.SWLTNSC.DAT, removed the offending “C9” character at the end of the street1 on record #2, resubmitted the SQLLOAD-LOOP and the newly spawned SQLLOAD component finished successfully.

Perhaps follow-up regarding this weird character should be done with National Student Clearinghouse, from whom this data originated (earlier in the chain).

Janice.
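One way to spot this kind of character before a load is to scan the data file for bytes outside the printable ASCII range. The sketch below is an illustration under that assumption; the function name is hypothetical, and it only reports record number, position within the record, and hex value.

```python
def find_non_ascii_bytes(path):
    """Yield (record_number, column, hex_value) for each byte outside 0x20-0x7E.

    Records are split on newline; tab and carriage return are ignored.
    Record numbers and columns are 1-based.
    """
    allowed_controls = {0x09, 0x0D}
    with open(path, "rb") as data_file:
        for record_number, record in enumerate(data_file, start=1):
            for column, byte in enumerate(record.rstrip(b"\n"), start=1):
                if byte in allowed_controls:
                    continue
                if byte < 0x20 or byte > 0x7E:
                    yield record_number, column, f"{byte:02X}"
```

In the case above, the offending byte was hex C9 on record #2, which is the kind of hit a scan like this is meant to report.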

Aborted Module Name: FAIDCFEX_FA.SSH_SFTP_01

Date: Day: Time: Resolution:

06/20/13 Thu 06:18 See note from Joleen below.

Error log and follow up comments:

# LOGNAME : jobprd

# USER : jobprd

# SRC FILE : /ais01/ftp/to/user/FAIDCFEX_FA.2013_06_20_0600.gpg

# DST FILE : cofcsu@ftp.college-assist.org:query/FAIDCFEX_FA.2013_06_20_0600.gpg

# IDENTITY : /home/jobprd/.ssh/csu_infosys_prod

# DIR HOST :

I removed the aborted FAIDCFEX jobs. We will run these again when the COF server is available.

Joleen.

Aborted Module Name: HRMSDED_SAL.HRMSRPTS-LOOP_01

Date: Day: Time: Resolution:

09/03/10 Fri 15:06 Restarted by Janice.

Error log and follow up comments:

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

Could not open server pipe.

======

Print failed! File:/appworx/out/HRMSDED_HRL.HRMSRPTS-LOOP_01.5005908.5005913.00.2010_09_03_1458.AWPROD.LOG Command:PRINTSIZE -d /appworx/out

Fri Sep 3 15:06:00 MDT 2010 Done with BODY

+ exit

retry of sizing successful

Retry on JOB_COMPLETION successful.

This was related to the Appworx problem. I had Craig kill the hrprod report (HRMSR002) which the HRMSRPTS_LOOP had spawned (and was still running) so we could just restart the failed component over again from HRMSR002.

Janice.

Aborted Module Name: AROSFRQ1.AROS-PYMTS-LOOP-01

Date: Day: Time: Resolution:

07/08/10 Fri 17:09 Restarted by Steve.

Error log and follow up comments:

I found past references of this error in our "Abort log" document that indicated we restarted the component. I did this, and now AROSFRQ1.AROS-PYMTS-LOOP_01 is running again and has spawned another AROSFRQ1.AROS_PYMTS_01.

Steve.

Aborted Module Name: AREGHRTM_SECTION_ENROLLMENT

Date: Day: Time: Resolution:

09/07/10 Tue 07:20 Launch errors, see Janice’s note.

Error log and follow up comments:

AREGHRTM_SECTION_ENROLLMENT is a cyclic chain, in that its chain schedules are defined to run hourly, 7 days per week. Based on the “Scheduled start date” and “Scheduled end date” for the chain schedules, only the AREGHRTM_FA schedule is currently active. Unfortunately, “Single run” cannot be specified for cyclic chains, as that interferes with multiple schedules (FA/SM/SP) running concurrently. Consequently, when failures occur, the cycles for a given schedule may back up, i.e. there may be many iterations of the chain schedule within backlog, each with a failed/LAUNCH ERROR component.

Such was the case with the AREGHRTM_FA schedule of AREGHRTM_SECTION_ENROLLMENT last night. This morning there were nine “stalled” iterations of this chain schedule, each with a chain component in LAUNCH ERROR status. In this type of situation, it would NOT be appropriate to restart LAUNCH ERROR components within all nine of these iterations at the same time, i.e. we would not want more than one iteration of a specific schedule of this chain running at the same time. Fortunately, this morning the components simply went back into LAUNCH ERROR status (due to the BANPROD database problem) when they were reset multiple times, thereby saving us from the possibility of nine iterations for the same schedule running simultaneously.

The appropriate method for cleanup in such situations is to analyze the status of the failed/stalled iterations of the chain, then delete iterations from backlog as appropriate until only one (or maybe no) iteration remains. There is no generic rule that can be applied: some chains with many components may require that a failed component be restarted and the remainder of the chain be allowed to complete, whereas in other cases it may be appropriate to simply delete the chain component and/or chain. Thorough analysis of the situation is always required.

However, in general, we can state that it would not be appropriate to take any action that would cause multiple iterations of the same chain schedule for a cyclic chain to run simultaneously.

To provide for more complete chain cleanup for this particular chain, I utilized the following procedure:

1) Delete the “LAUNCH ERROR” AREGHRTM_FA.AREGS415_01 component

2) Wait for AREGHRTM_FA.CHAIN_FINISH component and this iteration of AREGHRTM_SECTION_ENROLLMENT to complete

3) Repeat steps 1 and 2 for all iterations “backed up” in backlog for AREGHRTM_FA schedule of AREGHRTM_SECTION_ENROLLMENT

Note that since the AREGHRTM_FA chain schedule runs hourly, one iteration of the chain has successfully run since the BANPROD recycle was completed this morning. For this reason, there was no need to finish running any of the leftover iterations in backlog that had LAUNCH ERRORS.

Janice.
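The general rule stated above (never let two iterations of the same chain schedule run at once) can be expressed as a simple guard. The sketch below is a hypothetical model only: the `Iteration` record, the job IDs, and the "RUNNING" status string are assumptions for illustration and do not correspond to any AppWorx interface.

```python
from dataclasses import dataclass


@dataclass
class Iteration:
    schedule: str   # e.g. "AREGHRTM_FA"
    job_id: int     # hypothetical identifier for one backlog iteration
    status: str     # e.g. "LAUNCH ERROR" or the assumed "RUNNING"


def safe_to_restart(candidate, backlog):
    """True only if no other iteration of the same schedule is running."""
    return not any(
        other.schedule == candidate.schedule
        and other.job_id != candidate.job_id
        and other.status == "RUNNING"
        for other in backlog
    )
```

With nine stalled AREGHRTM_FA iterations, a guard like this would allow the first restart and refuse the rest until that iteration has finished, which matches the one-at-a-time procedure above.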

Aborted Module Name: FAIDSAIG_EV.TDCLIENT_01

Date: Day: Time: Resolution:

11/03/10 Wed 00:04 Restarted by Janice.

Error log and follow up comments:

FROM KEBLER:

Executing Transfer SAIGPORTAL-IDAP10OP

-----------------------------------------------

********** Start Communications Session

Connecting to server SAIGPORTAL...

200 Command OK.

Connected.

FTP login failed.531 Change password required

(531) FTP login failed.531 Change password required

Termination started...

Disconnecting...

221 Goodbye.

********** End Communications Session

As the error message indicates, SAIG was requiring that we change our password. I incremented the SAIG password by updating #FAID_TDCLIENT_NEW_PASSWORD and requested the password change chain, FAIDSPWD_TDCLIENT_CHG_PASSWORD. Once this chain completed, I reset the failed TDCLIENT components/modules.

Janice.

Aborted Module Name: HRMSDED_DED.HRMSRPTS-LOOP_01

Date: Day: Time: Resolution:

09/23/10 Thu 09:32 Restarted by Janice.

Error log and follow up comments:

+ egrep ABORTED|CRITFAIL|C-Error

+ grep 5095654

5095654.00 HRMS HRMSDED_DED.HRMSR05009/23 09:32 01:12:42 C-Error APPWORX HRMSDED_DEDUCTION_REPORTS

+ print Failure in spawned HRMSR050 - abort HRMSRPTS-LOOP

Failure in spawned HRMSR050 - abort HRMSRPTS-LOOP

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

When looping modules fail, it is helpful to provide the error message which caused the spawned module to fail.

In this case, from the HRMSDED_DED.HRMSR050_01 output file:

Thu Sep 23 09:32:03 :ORA-01555: snapshot too old: rollback segment number 17 with name "_SYSSMU17$" too small

Thu Sep 23 09:32:03 : ==> Select /*+ RULE */ asg.assignment_number

Thu Sep 23 09:32:03 :REP-0069: Internal error

Thu Sep 23 09:32:03 :REP-57054: In-process job terminated:Terminated with error:

Thu Sep 23 09:32:03 :REP-300: snapshot too old: rollback segment number 17 with name "_SYSSMU17$" too small

Thu Sep 23 09:32:03 : ==> Select /*+ RULE */ asg.assignment_number

We just restarted to try running this report again.

Janice.

Aborted Module Name: FAIDDLIM_OD.RERIM-LOOP_01

Date: Day: Time: Resolution:

10/06/10 Wed 08:40 See note from Janice below.

Error log and follow up comments:

5159723.00 BANNER FAIDDLIM_OD.RERIM11_10/06 08:40 00:00:01 ABORTED FAIDDLIM_DIRECT_LOAN_IMPORT

+ print Failure in spawned RERIM11 - abort this module

Failure in spawned RERIM11 - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

When we have failures of “looper” scripts, such as RERIM-LOOP, problem resolution requires understanding why the spawned module aborted. So, whenever RERIM-LOOP (or other looper scripts) issue a message such as:

Failure in spawned RERIM11 - abort this module

Then IT Scheduling should take the next step and also provide feedback regarding the failure within the spawned module, in this case FAIDDLIM_OD.RERIM11_07, as shown below:

'crbn11op.00_10_01.00_10_02.00_10_05.xml'

ERROR at line 15:

ORA-12899: value too large for column "GENERAL"."GJBPRUN"."GJBPRUN_VALUE"

(actual: 39, maximum: 30)

This will facilitate resolution of the problem by providing all the pertinent information.
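One way to gather that information is to scan the spawned module's joblog for common error markers before sending the notification. The following is only a minimal sketch: the function name show_spawned_errors and the pattern list (ORA-, REP-, FATAL) are assumptions drawn from the messages quoted in this log, not an existing production utility.

```shell
# Hypothetical helper: print lines from a spawned module's joblog that
# look like error messages (ORA-nnnnn, REP-nnnn, FATAL).
# The pattern list is an assumption; extend it to suit local joblogs.
show_spawned_errors() {
    joblog=$1
    if [ ! -r "$joblog" ]; then
        echo "cannot read joblog: $joblog" >&2
        return 2
    fi
    # grep returns 1 when nothing matches; callers can test for that.
    grep -E 'ORA-[0-9]+|REP-[0-9]+|FATAL' "$joblog"
}
```

The matching lines can then be pasted into the abort email alongside the looper's own "Failure in spawned ..." message.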

Janice.

Aborted Module Name: HRMSREC_SAL.WAIT_FOR_COND_01

Date: Day: Time: Resolution:

10/19/10 Tue 10:42 Restarted by ITS.

Error log and follow up comments:

From Operator Log Tab.

2010-10-19 10:00:03 Condition #1 inserted by OSU=appworx JDBC Thin Client

2010-10-19 10:00:03 Condition #2 inserted by OSU=appworx JDBC Thin Client

2010-10-19 10:42:35 Action argument from #AW99_{chain_id}=HRMSSAL1 to #AW99_5221916=HRMSSAL1 by OSU=appworx JDBC Thin Client

CON-2010-10-19 10:42:35 Set Subvar

CON-2010-10-19 10:42:47 Abort Task

This component failed because its BEFORE condition detected that the /userfiles/Uhrcrec/data/HRMSREC.CAMPUS.CSUH_CAMPUS_REC_TRANS.DAT file is empty. The BEFORE condition’s action is to ABORT TASK when this file does not exist or is empty. Apparently, we are not considering an empty feeder file as an acceptable situation.
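For reference, the test that the BEFORE condition applies can be reproduced by hand from the shell. This is a sketch only: check_feeder_file is a hypothetical name, and the real check is an Appworx condition rather than a script.

```shell
# Sketch of the same test the BEFORE condition applies: succeed only when
# the feeder file exists and is non-empty. check_feeder_file is a
# hypothetical name used for illustration.
check_feeder_file() {
    feeder=$1
    if [ ! -e "$feeder" ]; then
        echo "ABORT: $feeder does not exist"
        return 1
    fi
    if [ ! -s "$feeder" ]; then
        echo "ABORT: $feeder is empty"
        return 1
    fi
    echo "OK: $feeder has data"
}
```

Running it against the feeder file before restarting shows whether the component would abort again.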

Janice.

The campus recreation file from Campus Rec is empty, so the job aborted. Do you want us to skip the job if the file is empty, or should there always be data for us to pick up?

Diane.

There should always be data to pick up. I was told that the deadline was the end of business today. So I was getting ready to transfer the file this afternoon.

Jacqueline Nikolai.

The file is out there now.

Diane.

Aborted Module Name: AGENDYHB.SRRSRIN_01

Date: Day: Time: Resolution:

01/20/11 Thu 19:13 See follow up below.

Error log and follow up comments:

+ echo Program Failed to execute properly ..... program aborting

Program Failed to execute properly ..... program aborting

+ cat tempout.7813240

old: termout ON

new: termout OFF

one_up_is

+ grep SRRSRIN.lis /ais01/spool/vplus/parms/vplus_report_bookmark_titles

this_report_title=.Electronic_Prospect_Match

+ vplus_report_bookmark=AGENDYHB.SRRSRIN_01.Electronic_Prospect_Match

+ [[ -s /appworx/out/AGENDYHB.SRRSRIN_01.5668319.5668323.00.2182613.lis

+ ]] [[ ! -s /ais01/spool/vplus/temp/AGENDYHB_tempdir/AGENDYHB.SRRSRIN_01.lis ]]

+ print -n -- \n\fREPORT : AGENDYHB.SRRSRIN_01.Electronic_Prospect_Match\n\f

+ 1>> /ais01/spool/vplus/temp/AGENDYHB_tempdir/AGENDYHB.SRRSRIN_01.lis

+ cat /appworx/out/AGENDYHB.SRRSRIN_01.5668319.5668323.00.2182613.lis

+ 1>> /ais01/spool/vplus/temp/AGENDYHB_tempdir/AGENDYHB.SRRSRIN_01.lis

+ exit 1

+ err=1

Can you tell me where the input file to SRRSRIN is located?

Vicki.

The input file to the SRTLOAD step is /ais01/bkp/AGENDYHB.HRMSS095_01.5668319.BKP. I think that SRRSRIN then processes the data from the SRTLOAD step.

David.

Aborted Module Name: HRMSKFS_QPH.HRMSS175_01

Date: Day: Time: Resolution:

10/21/10 Thu 08:23 Deleted by ITS.

Error log and follow up comments:

08:23:37 1627 utl_file.fclose(out_file1);

08:23:37 1628 utl_file.fclose(out_file2);

08:23:37 1629

08:23:37 1630 If (ctl_sum_salary - l_net - ctl_sum_ee_deductions - ctl_sum_cash_deductions) <> 0 Then

08:23:37 1631 DBMS_OUTPUT.PUT_LINE('###################################################');

08:23:37 1632 DBMS_OUTPUT.PUT_LINE('##### #####');

08:23:37 1633 DBMS_OUTPUT.PUT_LINE('#### ####');

08:23:37 1634 DBMS_OUTPUT.PUT_LINE('### KFS FILE IS OUT OF BALANCE ###');

08:23:37 1635 DBMS_OUTPUT.PUT_LINE('#### ####');

08:23:37 1636 DBMS_OUTPUT.PUT_LINE('##### #####');

08:23:37 1637 DBMS_OUTPUT.PUT_LINE('###################################################');

08:23:37 1638 -- RAISE kfs_not_balanced;

08:23:37 1639 End if;

08:23:37 1640

08:23:37 1641 DBMS_OUTPUT.PUT_LINE('.');

08:23:37 1642 DBMS_OUTPUT.PUT_LINE

08:23:37 1643 ('**** End of HRMSS175 ' ||to_char(sysdate,'MM/DD/YYYY HH24:MI:SS'));

08:23:37 1644

08:23:37 1645 Exception

08:23:37 1646 When null_params Then

08:23:37 1647 raise_application_error(-20000, '**** FATAL ERROR - PARAMATER MISSING! ****');

08:23:37 1648 When kfs_not_balanced Then

08:23:37 1649 Rollback;

08:23:37 1650 raise_application_error(-20000, '**** FATAL ERROR - KFS FILE IS OUT OF BALANCE! ****');

08:23:37 1651 END;

08:23:37 1652 /

Just wondering why we need to send this email? The earlier email, with error message and attached joblog, went to the same distribution list as this email…. so these folks would already be aware of the abort and their responsibility to fix it.

Janice.

There doesn’t appear to be a problem. The best we can tell is that it’s an unusual timing issue. Let’s see what happens after the next run in about 2 hours.

Bob. V.

Aborted Module Name: HRMSQPD_QUICK_PAY_INIT_N_SCHED

Date: Day: Time: Resolution:

10/22/10 Fri 10:00 – 13:00 See notes below.

Error log and follow up comments:

We were expecting an Hourly check this morning but haven’t been notified that the 10:00 Quick Pays have run yet. Do you know what is going on?

Viv.

There seems to be a problem with today’s HRMSQPD_QUICK_PAY_INIT_N_SCHED chain. The HRMSCHK_CHECK_PROCESSING_QPH_1 module is in SELF WAIT status, but we can’t figure out why. This appears to have stalled the chain. Please advise, Thanks... Steve. G.

It looks like the HRMSCHK_CHECK_PROCESSING chain is currently running in HRMSSAL, so HRMSQPD is waiting for the HRMSSAL version to finish.

David.

Thanks, David. We’ll make a note of that possible scenario for the future.

Vivian,

See David’s reply above. I don’t know how long it may take for the HRMSSAL version of HRMSCHK_CHECK_PROCESSING to finish, but apparently the quick pay version will not run until it is done. It’s possible that by chance we just have never had this timing issue before when running Salary Phase 4. We’ll keep an eye on it.

Steve. G.

Kicked back in after 13:05 and HRMSQPD_QUICK_PAY_INIT_N_SCHED is running once more.

Dermot.

Aborted Module Name: AREGDYWL_FA.VPLUS_RCAP-LOOP_01

Date: Day: Time: Resolution:

10/24/10 Sun 23:01 See note from Janice below.

Error log and follow up comments:

+ awexe jh

+ egrep ABORTED|CRITFAIL

+ grep 5250863

5250863.00 BATCH AREGDYWL_FA.VPLUS_RC10/24 23:01 00:00:00 ABORTED AREGDYWL_STOP_WAIT_LIST_NOTIFY

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

As this error indicates, the spawned module aborted. Therefore, to determine the cause of the error, it is important to view the error message in the joblog of the aborted spawned module, AREGDYWL_FA.VPLUS_RCAPTURE_01:

+ /vplus/rcapture -hvplusprod.is.colostate.edu -dAREGDYWL_FA -m -ips

+ /ais01/spool/vplus/out/VPC.561472/AREGDYWL_FA.5250837.txt.ps

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu on port: 7980

+ exit 4

error is 4

This error was simply a problem connecting to vplusprod - I restarted the AREGDYWL_FA.VPLUS_RCAP-LOOP_01 component so it would re-spawn the VPLUS_RCAPTURE module.

Janice.

Aborted Module Name: OSYSJOBS_05.OSYSLLNK_01

Date: Day: Time: Resolution:

10/31/10 Sun 16:30 See note from Janice below.

Error log and follow up comments:

Remote Shell errtrap_rsh parm 2 value is 3 <<errtrap_rsh.3>> [[ 3 > 0 ]] <<errtrap_rsh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=3 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=3

***

<<errtrap_rsh.7>> exit 3

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_05.OSYSLLNK_01.5287184.5287190.00.2010_10_31_1630.log

+ 1> /dev/null

+ + cut -f 2 -d =

+ grep ^*** ERROR: /ais02/log/OSYSJOBS_05.OSYSLLNK_01.5287184.5287190.00.2010_10_31_1630.log

+ grep SCRIPT ABORTED

rsh_return_code=3

+ print *** \n*** RSH EXECUTED SCRIPT sys_llnk_rsh.ksh EXIT CODE=3 \n*** EXIT WITH EXIT CODE=3 \n***

***

*** RSH EXECUTED SCRIPT sys_llnk_rsh.ksh EXIT CODE=3

*** EXIT WITH EXIT CODE=3

When the errtrap routine is invoked, it means that the last command executed returned a non-zero return code. Therefore, it is important to include that last command, along with any associated error messages, when troubleshooting the problem. In this case:

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls <#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_14.OSYSPURG_01.5287280.5287284.00_ban_too_old is not valid.

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_14.OSYSPURG_01.5287280.5287284.00_slave_files is not valid.

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_15.OSYSPURG_01.5287288.5287292.00_too_old is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 3 Remote Shell errtrap_rsh parm 2 value is 3

Sometimes we see these "status is not valid" messages for the find command - generally caused by a file being deleted after the find command "found" the filename. Usually, we can just restart jobs in this situation, which I did for the two OSYSJOBS this morning and they have successfully completed.
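As a rough aid for deciding whether a simple restart is appropriate, the captured find errors can be checked to confirm that nothing other than the "status is not valid" message is present. This is a sketch under assumptions: only_transient_find_errors is a hypothetical name, and it expects the find error lines to have been saved to a file first.

```shell
# Hypothetical helper: decide whether captured find(1) stderr contains only
# the transient AIX "0652-019 ... is not valid" messages quoted above.
# Returns 0 when every non-blank line is that message (a restart is likely
# safe), 1 when any other error is present or when there is no output.
only_transient_find_errors() {
    errfile=$1
    # No find errors at all means something else failed.
    grep -q '[^[:space:]]' "$errfile" || return 1
    # Any non-blank line that is not the 0652-019 message fails the check.
    if grep '[^[:space:]]' "$errfile" |
        grep -v '^find: 0652-019 The status on .* is not valid\.$' >/dev/null
    then
        return 1
    fi
    return 0
}
```

A return code of 1 means the joblog needs a closer look before any restart.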

Janice.

Aborted Module Name: AREGDYWL_FA.VPLUS_RCAP-LOOP_01

Date: Day: Time: Resolution:

04/14/13 Sun 01:02 Restarted by Elden.

Error log and follow up comments:

04/14/2013 01:20 EFLICK

AREGDYWL_FA.VPLUS_RCAP-LOOP failed with VistaPlus network error. Confirmed report was not captured to VistaPlus.

Tried resetting it, but it aborted again. Need to try again after the hierarchy builder runs and refreshes Vista Plus.

Aborted Module Name: AREGDYWL_SP.VPLUS_RCAP-LOOP_01

Date: Day: Time: Resolution:

11/07/10 Sun 23:01 See note from Janice below.

01/02/11 Sun 23:02 No follow up received.

Error log and follow up comments:

11/07/10.

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu on port: 7980

+ /vplus/rcapture -hvplusprod.is.colostate.edu -dAREGDYWL_FA -m -ips

+ /ais01/spool/vplus/out/VPC.561472/AREGDYWL_FA.5250837.txt.ps

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu on port: 7980

+ exit 4

error is 4

This error was simply a problem connecting to vplusprod - I restarted the AREGDYWL_FA.VPLUS_RCAP-LOOP_01 component so it would re-spawn the VPLUS_RCAPTURE module.

Janice.

01/02/11.

+ rm -ef /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

rm: Removing /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

+ grep FTP_

+ print AREGDYWL_SP.VPLUS_RCAP-LOOP_01

+ rm -ef /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

While it was fine in this case to reset AREGDYWL_SP.VPLUS_RCAP-LOOP_01, it is important to verify why the module actually aborted. The error from AREGDYWL_SP.VPLUS_RCAP-LOOP_01:

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

So appropriate follow-up would need to include verifying why the spawned module (AREGDYWL_SP.VPLUS_RCAPTURE_01) failed:

+ /vplus/rcapture -hvplusprod.is.colostate.edu -dAREGDYWL_SP -m -ips /ais01/spool/vplus/out/VPC.938276/AREGDYWL_SP.5583184.txt.ps

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu on port: 7980

+ exit 4

error is 4

From the original email, it does not appear that the log for the spawned AREGDYWL_SP.VPLUS_RCAPTURE_01 module was examined. While the end result of resetting the AREGDYWL_SP.VPLUS_RCAP-LOOP_01 was the appropriate course of action, that was only the case because of the particular reason that the spawned AREGDYWL_SP.VPLUS_RCAPTURE_01 failed.

Janice.

Aborted Module Name: AREGDYTR.CONVERT_PDFTOPS_01

Date: Day: Time: Resolution:

11/10/10 Wed 07:15 See note from Janice below.

12/01/10 Wed 07:18 No follow up received.

Error log and follow up comments:

11/10/10.

Due to the error in the AREGR600, there was no output PDF file from AREGR600 to be used as input to the CONVERT_PDFTOPS component. I deleted this failed component, as well as the subsequent SPOOL_FILTER component which would have spooled the “postscript” output from the CONVERT_PDFTOPS component.

*** Oracle Report: AREGR600

Processing Failed -- Report was not successfully generated.

*** From Appworx Chain: AREGDYTR_DAILY_TRANSCRIPT

*** Oracle Instance: banprod

*** Report Parameters Used:

req_levl=AL

p_source=B

PDFEMBED=YES

*** Report Errors:

REP-0177: Error while running in remote server

Unable to retrieve a string from the Report Builder message file.

REP--002:

Janice.

12/01/10.

The before Condition Details indicate it is checking for a file that does not exist:

{#spool_out}/AREGDYTR.AREGR600.{chain_id}.PDF

*** Oracle Report: AREGR600

Processing Failed -- Report was not successfully generated.

*** From Appworx Chain: AREGORTR_ONREQ_TRANSCRIPT

*** Oracle Instance: banprod

*** Report Parameters Used:

p_reprint_date=01-DEC-2010

req_levl=AL

p_source=R

*** Report Errors:

REP-0177: Error while running in remote server

Unable to connect to the specified database.

Module still running: AGENDYGN.AGENS024_01

Date: Day: Time: Resolution:

11/12/10 Fri 09:00 See note from Janice below.

Error log and follow up comments:

AGENDYGN_DAILY_GENERAL is still running from last night’s schedule. Normal run time is 2 – 4 hours.

Module AGENDYGN.AGENS024_01 is the step currently running; it normally takes less than a minute to complete.

Dermot.

I consulted with Mark Britton regarding this – there seems to be some stall with the link being used in AGENS024.sql between BANPROD and HRPROD. I killed the Appworx AGENDYGN.AGENS024_01 component – Mark will do cleanup on the databases to ensure that the associated processes are no longer running. Once that cleanup has been completed, we’ll restart the KILLED AGENDYGN.AGENS024_01 component.

The problem is happening again so I killed the Appworx AGENDYGN.AGENS024_01 component.

DBAs are looking into the problem, which may take a while to resolve.

Janice.

If you killed this job in Appworx, I will kill the sessions in the hrprod database.

They are still running in there.

Craig.

Yes – it has been killed in Appworx.

Janice.

Mark modified the AGENS024.sql script to work-around the problem. I restarted the KILLED AGENDYGN.AGENS024_01 component and it successfully completed in 47 seconds!

IT Scheduling should flag /ais01/src/sql/temp/AGENS024.sql as okay through the end of November.

As follow-up, we’ll need a Clarity incident to document the AGENS024.sql change, add a modlog entry to the sql, etc., and migrate it into production.

Janice.

Aborted Module Name: HRMSS041.HRMSS041_01

Date: Day: Time: Resolution:

11/12/10 Fri 20:40 See note from Bob below.

Error log and follow up comments:

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 627

20:40:53 627 l_segment := csuh_edi_834_pkg.edi_ins(

declare

ERROR at line 1:

ORA-20000: **** FATAL ERROR! ****

ORA-06512: at line 798***

In researching, I see that the error stems from the fact that “Jones, Kelley” does not have a contact_type_code. This code is derived from taking the contact_type (in the per_contact_relationships table) and decoding it to some other value. In this case the contact_type is “P” and it is not decoded to anything.

To resolve this issue, we need to determine which contact_type_code the contact_type “P” should be decoded to, and then add that decode to the csr_dep cursor.

-Bob-

So who needs to make this determination?

Janice.

Chris D. was notified. We’re waiting for her decision.

Thanks, -Bob-

The data that caused this problem has been changed.

Please rerun the HRMSS041 script. -Bob-

Aborted Module Name: HRMSMLV_HRL.CHAIN_FINISH_01

Date: Day: Time: Resolution:

11/15/10 Mon 09:04 See note from Janice below.

Error log and follow up comments:

+ print *** \n*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog_exceptions

+ 1>> /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01_jobstat

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog /appworx/out/HRMSMLV_HRL.CHAIN_FINISH_01.5364967.5364972.00.2010_11_15_0904.AWPROD.LOG

+ print *** \n*** END SEARCH OF JOBLOG FOR ERROR STRINGS \n***

+ 1>> /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01_jobstat

+ cat /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01_jobstat

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cat: 0652-050 Cannot open /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01.PREFIX.spool.lis.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

This was a timing issue with the HRMSMLV_HRL.CHAIN_FINISH and a spawned OAE_FEEDBACK sub_chain CHAIN_FINISH. I've modified the global PREFIX script to uniquely qualify the PREFIX generated spool.lis file with the jobid. This should prevent such timing issues in the future. The component was restarted and completed successfully.
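The idea behind that change can be illustrated with a small sketch. The exact file name pattern used by the modified PREFIX script is not shown in this log, so the layout below (jobid inserted before spool.lis) and the name spool_file_for are assumptions for illustration only.

```shell
# Sketch only: build a per-job spool file name so two CHAIN_FINISH
# components running at once never share the same file.
# spool_file_for and its argument layout are hypothetical.
spool_file_for() {
    workdir=$1   # work directory, e.g. /ais01/dat/work/prod
    module=$2    # component name, e.g. HRMSMLV_HRL.CHAIN_FINISH_01
    jobid=$3     # Appworx job id of the running component
    printf '%s/%s.PREFIX.%s.spool.lis\n' "$workdir" "$module" "$jobid"
}
```

Because each running component has its own jobid, a sub_chain CHAIN_FINISH can no longer remove or overwrite the file that the parent CHAIN_FINISH is about to read.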

Janice.

Aborted Module Name: APMXMISC.APMXLOGP_01

Date: Day: Time: Resolution:

11/15/10 Mon 13:00 See note from Janice below.

11/22/10 Mon 10:41 See note from David below.

Error log and follow up comments:

11/15/10.

rm: Removing ./KFSXGLSC_D1.5330031.txt

rm: Removing ./KFSXGLSC_D2.5330303.txt

rm: Removing ./KFSXGLSF.5330163.txt

rm: Removing ./KFSXPDDR.KFSX_JAVA_01.5326775.pdf

rm: Removing ./KFSXPDGL.5330182.txt

rm: 0653-603 Cannot remove directory ./VPC.2318484.

rm: Removing ./VPC.2318484/AREGDYWL_SP.5325961.txt.ps

+ exit 2

The bulk of the time required to run the APWXMISC.APWXLOGP_01 component is spent on the joblog cleanup. Determining which joblogs to delete is complex, since the analysis must retain 14 days and/or the latest 5 generations for each chain schedule of each chain. Historically, this portion of the APWXLOGP job takes about 45 minutes to run. Whenever restarting a failed APWXMISC.APWXLOGP_01 component, the "Skip joblog cleanup" prompt for this component should be changed to Y **IF** the failure occurred after the completion of the joblog cleanup.

As an example, in this failure, the joblog indicates removal of joblog files had begun and completed:

*** 11/15/2010-13:47:01

*** REMOVE /ais01/joblog FILES

.... feedback for various joblog removals

*** 11/15/2010-13:47:05

*** END COMMON CLEANUP FOR APPWORX AWPROD AGENT

The failure occurred subsequent to the joblog cleanup... in the section to remove /ais01/spool/vplus/out files:

*** 11/15/2010-13:47:07

*** REMOVE /ais01/spool/vplus/out FILES

So, in this case, the restart should have been done with the "Skip joblog cleanup" prompt set to Y - which would have skipped this time intensive joblog analysis/removal process.

By the way, the "Skip joblog cleanup" prompt change to "Y" would need to take place in **BACKLOG** prior to restarting the failed APWXMISC.APWXLOGP_01 component.
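The check described above can be sketched as a quick look for the end-of-cleanup marker in the failed component's joblog. This is an illustration under assumptions: suggest_skip_joblog_cleanup is a hypothetical name, and it relies on the "END COMMON CLEANUP FOR APPWORX" line appearing only after the joblog removal has finished, as in the excerpt above.

```shell
# Hypothetical helper: suggest a value for the "Skip joblog cleanup" prompt
# by looking for the end-of-cleanup marker in the failed component's joblog.
# Prints Y when the marker is present (cleanup finished), otherwise N.
suggest_skip_joblog_cleanup() {
    joblog=$1
    if grep -q 'END COMMON CLEANUP FOR APPWORX' "$joblog" 2>/dev/null; then
        echo Y
    else
        echo N
    fi
}
```

The printed value is only a suggestion; the prompt itself still has to be changed in BACKLOG before the restart.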

Janice.

11/22/10.

We were trying to figure out this abort. Janice sent an e-mail on 11/15/10 about a similar abort. She mentioned that if the clean up process had already started, then a prompt flag would have to be changed before re-starting this module. We could not find any place that indicated that the clean up process had started. Did you have to change the prompt flag before restarting this module?

We are just trying to learn.

Yes,

I modified the ‘Skip joblog cleanup’ from N to Y. I did this because the failure occurred after the ‘COMMON’ purges had completed. I will have IT scheduling perform this next time this happens. It seems to happen quite often.

David.

Aborted Module Name: FAIDCFAT_FA.GLBDATA_01

Date: Day: Time: Resolution:

11/16/10 Tue 06:00 See follow up below.

Error log and follow up comments:

+ FAIDCFAT_SP.GLBDATA_01 + FAIDDYNT_EV.GLBDATA_03 + FAIDTRAK_EV.GLBDATA-LOOP_01

*ERROR* DURING PREPARE PARM2...ABORTING

SQLCODE = 0942

SQL ERROR = ORA-00942: table or view does not exist

X01 ROLLBACK SQLCODE=0000

X01 COMMIT (1) SQLCODE=0000

SQLCODE = 0000

ORA-01403: no data found

DQY-ABORT ROLLBACK SQLCODE = 0000

Tom Biedscheid is aware of the various GLBDATA failures - it has to do with the Banner Financial Aid upgrade over the weekend. IT Scheduling doesn't need to send the joblogs/email for all of these GLBDATA failures - pretty sure if we find the problem with one, it will fix all...

Here’s the Control Report from FAIDTRAK – the other failures look similar:

SUNGARD HIGHER EDUCATION

POPULATION SELECTION EXTRACT

CONTROL REPORT PAGE 1

Start Time: 16-NOV-2010 00:24:17

GLBDATA Version: 8.3.0.5

Selection ID 1: FAIDTRAK_EV_TRACK_GROUP

Application: FINAID

Creator ID: FAUSER

*ERROR* DURING PREPARE PARM2...ABORTING

SQLCODE = 0942

SQL ERROR = ORA-00942: table or view does not exist

X01 ROLLBACK SQLCODE=0000

X01 COMMIT (1) SQLCODE=0000

SQLCODE = 0000

ORA-01403: no data found

DQY-ABORT ROLLBACK SQLCODE = 0000

Janice.

Aborted Module Name: OSYSJOBS_04.OSYSLLNK_01

Date: Day: Time: Resolution:

11/21/10 Sun 16:30 Restarted by Janice.

Error log and follow up comments:

+ grep ^*** ERROR: /ais02/log/OSYSJOBS_04.OSYSLLNK_01.5398240.5398246.00.2010_11_21_1630.log

+ grep SCRIPT ABORTED

rsh_return_code=1

+ print *** \n*** RSH EXECUTED SCRIPT sys_llnk_rsh.ksh EXIT CODE=1 \n*** EXIT WITH EXIT CODE=1 \n***

***

*** RSH EXECUTED SCRIPT sys_llnk_rsh.ksh EXIT CODE=1

*** EXIT WITH EXIT CODE=1

I'm forwarding this email which I sent early in November regarding a couple OSYSLLNK failures. Please review this email for instructions on troubleshooting OSYSLLNK failures and as was indicated near the end of the email:

Based on the previously communicated information, IT Scheduling could have just restarted the two failed OSYSLLNK components this morning because they both failed with similar 'The status on "filename here" is not valid' errors.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls <#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_14.OSYSPURG_01.5287280.5287284.00_ban_too_old is not valid.

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_14.OSYSPURG_01.5287280.5287284.00_slave_files is not valid.

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_15.OSYSPURG_01.5287288.5287292.00_too_old is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 3 Remote Shell errtrap_rsh parm 2 value is 3

Janice.

I did a quick check on this. I think the error was a little farther back in the error log.

Check out the find command which produced the error.

I believe we get these once in a while when temp files are created and deleted while cleanup jobs are running.

Rich.

<</ais02/job/prod/kshexe_rsh.70>> sys_llnk_rsh.ksh

<#/ais02/job/prod/sys_llnk_rsh.ksh.23#> alias log=echo "*** " $(date +%m/%d/%Y-%T)

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /app/dars/darsprod/dars35/bin/temp/dumpww10110712281616 is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 1

Aborted Module Name: HRMSKFS_QPH.HRMSS175_01

Date: Day: Time: Resolution:

11/29/10 Mon 08:25 Restarted by Janice.

Error log and follow up comments:

08:25:35 1627 utl_file.fclose(out_file1);

08:25:35 1628 utl_file.fclose(out_file2);

08:25:35 1629

08:25:35 1630 If (ctl_sum_salary - l_net - ctl_sum_ee_deductions - ctl_sum_cash_deductions) <> 0 Then

08:25:35 1631 DBMS_OUTPUT.PUT_LINE('###################################################');

08:25:35 1632 DBMS_OUTPUT.PUT_LINE('##### #####');

08:25:35 1633 DBMS_OUTPUT.PUT_LINE('#### ####');

08:25:35 1634 DBMS_OUTPUT.PUT_LINE('### KFS FILE IS OUT OF BALANCE ###');

08:25:35 1635 DBMS_OUTPUT.PUT_LINE('#### ####');

08:25:35 1636 DBMS_OUTPUT.PUT_LINE('##### #####');

08:25:35 1637 DBMS_OUTPUT.PUT_LINE('###################################################');

08:25:35 1638 -- RAISE kfs_not_balanced;

08:25:35 1639 End if;

08:25:35 1640

08:25:35 1641 DBMS_OUTPUT.PUT_LINE('.');

08:25:35 1642 DBMS_OUTPUT.PUT_LINE

This appeared to be a timing issue. HRMSS175 did **NOT** fail with an out of balance error. The section of the joblog included in this email is simply a section of the sql logic, but this logic was not invoked by HRMSS175 during this execution. In fact, HRMSS175 did not fail at all but rather the SUCCESS.HRMSS175 logic detected that the utl_file2 (.recon file) was empty and consequently set the hrmss175_status_{chain_id} subvar accordingly:

+ hrmss175_status=ERROR - HRMSS175 output file(s) empty/missing

+ [[ ERROR - HRMSS175 output file(s) empty/missing != SUCCESSFUL ]]

+ print \n*** HRMSS175 NOT SUCCESSFUL - ABORT \n***

+ 1> /ais01/dat/work/prod/HRMSKFS_QPH.HRMSS175_01.Job_Error_Summary

+ print ERROR - HRMSS175 output file(s) empty/missing

+ 1>> /ais01/dat/work/prod/HRMSKFS_QPH.HRMSS175_01.Job_Error_Summary

+ [[ -s ]]

+ echo END OF SUCCESS.HRMSS175

END OF SUCCESS.HRMSS175

The HRMSS175 chain component actually goes into ABORT status when the after condition checks the value in {#hrmss175_status_{chain_id}} and aborts the task if {#hrmss175_status_{chain_id}} is not equal to SUCCESSFUL.

So, there was a timing issue because the output utl_file2 was actually not empty -- but at the time that the SUCCESS.HRMSS175 script checked, it was empty. Should this problem recur frequently, we may want to include a slight delay within the SUCCESS.HRMSS175 script (via a sleep command) to attempt to eliminate timing issues.
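A delay of that kind could look something like the sketch below, which re-checks the file a few times rather than sleeping once. The function name, retry count and interval are assumptions for illustration and are not taken from the SUCCESS.HRMSS175 script.

```shell
# Sketch of the suggested delay: poll for a non-empty file a few times
# before declaring it empty/missing. Function name, retry count and
# interval are assumptions, not values from SUCCESS.HRMSS175.
wait_for_nonempty() {
    file=$1
    tries=${2:-5}      # number of checks to make
    interval=${3:-2}   # seconds to wait between checks
    i=1
    while [ "$i" -le "$tries" ]; do
        if [ -s "$file" ]; then
            return 0
        fi
        # Skip the sleep after the final check.
        if [ "$i" -lt "$tries" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    return 1
}
```

The status would then be set to the empty/missing error only when wait_for_nonempty returns non-zero for the .recon file.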

I restarted the failed HRMSS175 and it completed okay.

Janice.

Aborted Module Name: AROSDGLI.AROSS167

Date: Day: Time: Resolution:

11/30/10 Tue 21:00 See correct error from David below.

Error log and follow up comments:

*** /appworx/out/FAIDINST_NW.LYNX_01.status.txt ***

URL=http://wsprod.colostate.edu/cwis231/onet/autorun/triple_crown_schols.aspx (GET)

STATUS=HTTP/1.1 200 OK

URL=http://wsprod.colostate.edu/cwis231/onet/autorun/triple_crown_schols.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

Here is some more information regarding the FAIDINST.LYNX_01 failure:

</head>

<body bgcolor="white">

<span><H1>Server Error in '/CWIS231/onet' Application.<hr width=100% size=1 color=silver></H1>

<h2> <i>ORA-20100: ::Cannot create, record already exists::<br>ORA-06512: at "BANINST1.CSUG_API_GLBEXTR", line 287<br>ORA-06512: at line 1<br></i> </h2></span>

<font face="Arial, Helvetica, Geneva, SunSans-Regular, sans-serif ">

<b> Description: </b>An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

<br><br>

<b> Exception Details: </b>System.Data.OracleClient.OracleException: ORA-20100: ::Cannot create, record already exists::<br>ORA-06512: at "BANINST1.CSUG_API_GLBEXTR", line 287<br>ORA-06512: at line 1<br><br><br>

<b>Source Error:</b> <br><br>

David.

Aborted Module Name: INTLDALY.INTLS005_01

Date: Day: Time: Resolution:

09/14/11 Wed 19:08 Restarted by Joleen.

Error log and follow up comments:

ERROR at line 1:

ORA-20001: Bad data for PIDM: 10643397-1843 -ERROR- ORA-01843: not a valid month

ORA-06512: at line 292

The bad date was corrected so the INTLS005 module can be run again.

Peter.

Aborted Module Name: AREGORFP_FA.SEND_MAIL_01

Date: Day: Time: Resolution:

12/08/10 Wed 13:06 See follow up from Vicki.

Error log and follow up comments:

#------------------------------------------------------------------------------

# # ADDRESS FILE [to:/ais01/dat/misc/mailst/SEND_MAIL.ADMSS425.LST]

#******************************************************************************

# FATAL : < main::validate_address

# FATAL : Error opening address file (/ais01/dat/misc/mailst/SEND_MAIL.ADMSS425.LST) : A file or directory in the path name does not exist.

#------------------------------------------------------------------------------

# [ 2010.12.08-13:06:54 ]

# RETURN CODE = 100

#==============================================================================

error is 100

===== Exiting PERL_CSU =====

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

The thing that jumps out to me is ADMSS425. This chain is not executing Admission information.

# ADDRESS FILE [to:/ais01/dat/misc/mailst/SEND_MAIL.ADMSS425.LST]

Jerry Becker wrote the following:

The distribution list the chain should send the report (comma delimited file) to is ro_rpt_processing@mail.colostate.edu.

Vicki.

Aborted Module Name: ADMSSRLD_FR.SQLSURLOAD-LOOP_01

Date: Day: Time: Resolution:

12/30/10 Thu 22:49 See follow up below.

Error log and follow up comments:

+ rm -ef /ais01/dat/work/prod/ADMSSRLD_FR.SQLSURLOAD-LOOP_01.surload_driver /ais01/dat/work/prod/d5575457

rm: removing /ais01/dat/work/prod/ADMSSRLD_FR.SQLSURLOAD-LOOP_01.surload_driver

rm: removing /ais01/dat/work/prod/d5575457

+ read this_sqlsurload_letter

+ let iteration_count=5+1

+ (( 6 < 10 ))

+ iteration_cnt=06

+ grep END SQL FOR LETTER_CODE=AGEN_AGEN_PTRFA /ais01/dat/work/prod/ADMSSRLD_FR.SQLSURLOAD-LOOP_01.DAT

+ print END SQL line not found on driver for AGEN_AGEN_PTRFA - abort this module

END SQL line not found on driver for AGEN_AGEN_PTRFA - abort this module

+ exit 1

+ err=1

I have corrected the error in the driver file so that we're good to go next time. I'll talk to Joe about whether or not we want to run this Monday morning.

Kathy.

I modified the “work” version of the driver with the corresponding change

--* BEGIN SQL FOR LETTER_CODE=AGEN_AGEN_PTRFA

--* BEGIN SQL FOR LETTER_CODE=AGEN_PTRFA

so that the failed SQLSURLOAD-LOOP component could be restarted, picking up where it left off, thereby allowing for completion of the ADMS schedule today instead of waiting until Monday :-)
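A mismatch like this one could be caught before the chain runs with a simple pre-check of the driver file. The sketch below assumes the marker lines look exactly like the BEGIN SQL / END SQL lines quoted in this thread and start in column one; check_driver_pairs is a hypothetical name, not part of SQLSURLOAD-LOOP.

```shell
# Hypothetical pre-check: for every "--* BEGIN SQL FOR LETTER_CODE=x" line
# in a driver file, confirm a matching "--* END SQL FOR LETTER_CODE=x" line
# exists. Prints each unmatched letter code and returns 1 if any are found.
check_driver_pairs() {
    driver=$1
    missing=0
    for code in $(sed -n 's/^--\* BEGIN SQL FOR LETTER_CODE=//p' "$driver"); do
        if ! grep -Fqx -- "--* END SQL FOR LETTER_CODE=$code" "$driver"; then
            echo "END SQL line not found for $code"
            missing=1
        fi
    done
    return "$missing"
}
```

Running it after editing the driver file gives the same information the loop would report, without waiting for the nightly abort.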

Janice.

Thanks, Janice. I just corrected it in the file on Kebler...but I also made another change at the end of the file. We had two identical pieces of SQL for two different letter codes so I removed the duplicate code and added the letter code to the first one, separated with a comma:

--* BEGIN SQL FOR LETTER_CODE=AENR_LHPFR,ASPY_LHPFR

--* END SQL FOR LETTER_CODE=AENR_LHPFR,ASPY_LHPFR

I'm just letting you know, in case you notice the difference. It won't matter if we use the work version that has the two separate sql's because I added the ASPY_LHPFR manually to Banner this morning. Since the AENR_LHPFR code did run and generate letters, I wanted to make sure they stayed in sync. Since the letters are already out there, the code that runs shouldn't pick up anyone new to add. But I'll check it later just to make sure.

Sorry for the error. I did a lot of cut-and-pasting yesterday and even though I checked it several times for typos like this, I still missed it. :-( Sorry to make you work on a holiday!

Kathy.

When I did a “diff” between the files, the only change it detected was the one I noted – so, that’s the only one I changed in the “work” version of the driver which was being used by this step. Guess you must have slipped this change in after I looked!

Your schedule is progressing now – only ADMSLETS and ADMSEMAL left running.

Have a Happy New Year!

Janice.

Aborted Module Name: KFSXGLPO_D1.KFSX_JAVA_02

Date: Day: Time: Resolution:

09/29/11 Thu 20:02 See follow up below.

Error log and follow up comments:

at $Proxy210.post(Unknown Source)

at org.kuali.kfs.gl.batch.service.impl.PosterServiceImpl.postTransaction(PosterServiceImpl.java:433)

... 46 more

Caused by: java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."GL_ENTRY_T"."TRN_LDGR_ENTR_DESC" (actual: 41, maximum: 40)

at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

Dermot called having problems re-starting KFSXGLPO_D1.KFSX_JAVA_02. The chain was aborting by design.

The chain expects the re-start file to be located at /ais02/app/kfs/prd/work/staging/gl/originEntry/gl_sortpost.restart.data. I copied the backup of the gl_sortpost.data to the appropriate file name. I used the ABORT Chain notes as a guideline to do this. Then John was able to fix the bad records and KFSXGLPO_D1.KFSX_JAVA_02 was re-started and completed successfully.

David.

Poster bombed for at least the 3rd time for the exact same problem. I was disappointed that we needed to involve David last night. I have updated the spreadsheet, but I’d like to revisit the problem, so that we document it thoroughly. I truly believe that we do not need to involve Janice or David P next time. Let’s discuss this next week.

John.

If this scenario happens again, the following step needs to be taken:

COPY

/ais01/bkp/{#1}.KFSX_JAVA_01.gl_sortpost.data.{#apmx_now_yyyymmdd_time}.bkp

/ais02/app/kfs/prd/work/staging/gl/originEntry/gl_sortpost.restart.data

Then restart the ABORTED module once the odd characters have been removed.
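One possible way to locate the "odd characters" before restarting is to scan the restart file for non-ASCII characters. This is only a sketch: the function name is hypothetical, and it assumes (the notes above do not confirm this) that the bad records are ones where a multi-byte character pushes the description past the 40-byte column limit reported by ORA-12899.

```python
def find_odd_lines(lines):
    """Return (line_number, offending_characters) for lines with non-ASCII text."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Anything above code point 127 takes more than one byte in UTF-8.
        odd = [ch for ch in line if ord(ch) > 127]
        if odd:
            hits.append((number, "".join(odd)))
    return hits


# Made-up sample records, not taken from gl_sortpost.data.
sample = [
    "PLAIN DESCRIPTION OF FORTY CHARS OR LESS",
    "CAF\u00c9 SUPPLIES",  # one accented character, two bytes in UTF-8
]
print(find_odd_lines(sample))
```

The reported line numbers could then be handed to whoever corrects the records before the module is restarted.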

Aborted Module Name: AREGDYWL_SP.VPLUS_RCAP-LOOP_01

Date: Day: Time: Resolution:

01/09/11 Sun 23:02 Restarted by Steve.

01/06/13 Sun 00:02 Restarted by Joleen.

Error log and follow up comments:

01/09/11.

+ rm -ef /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

rm: Removing /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

+ grep FTP_

+ print AREGDYWL_SP.VPLUS_RCAP-LOOP_01

+ rm -ef /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

+ /vplus/rcapture -hvplusprod.is.colostate.edu -dAREGDYWL_FA -m -ips

+ /ais01/spool/vplus/out/VPC.561472/AREGDYWL_FA.5250837.txt.ps

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu on port: 7980

+ exit 4 error is 4

This error was simply a problem connecting to vplusprod - I restarted the AREGDYWL_FA.VPLUS_RCAP-LOOP_01 component so it would re-spawn the VPLUS_RCAPTURE module.

Please proceed by checking the spawned output joblog (AREGDYWL_SP.VPLUS_RCAPTURE_01). If the problem there is the "cannot connect to Host" error message, which I suspect it will be, then simply restart the VPLUS_RCAP-LOOP component as described from this older email dated back in October.

Janice.

01/06/2013.

+ egrep ABORTED|CRITFAIL|C-Error

9643375.00 BATCH AREGDYWL_SP.VPLUS_RC01/05 23:01

00:00:02 ABORTED

AREGDYWL_STOP_WAIT_LIST_NOTIFY

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

There are no conditions. I looked this up in Dermot's ABORT log. This Process Flow has aborted before with the same error; there was a note from Janice:

This error was simply a problem connecting to vplusprod - I restarted the AREGDYWL_FA.VPLUS_RCAP-LOOP_01 component so it would re-spawn the VPLUS_RCAPTURE module.

I restarted this job and it has finished running.

Joleen.

Aborted Module Name: AROSMSTM.AROSS303_01

Date: Day: Time: Resolution:

01/15/11 Sat 10:00 See note from Janice & Josh below.

Error log and follow up comments:

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

ORA-03114: not connected to ORACLE

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

I restarted AROSMSTM.AROSS303_01, but have placed the next component (AROSMSTM.AROSS305_01) on hold so results of AROSS303 can be verified before we allow the remainder of the chain to proceed.

IT Scheduling:

Please monitor AROSMSTM.AROSS303_01 and reply all to this email regarding AROSS303 completion/failure.

There is a problem with the statement generation program, coupled with the massive number of transactions to be processed. We think statement generation, using the existing process, could take DAYS! Josh is working on a re-write of statement generation - a project that has been in the works for a while, but due to the current performance issue has now become a high priority project (like we need it yesterday!).

Josh may be able to provide additional info regarding estimated timeframe.

Janice.

We are working to implement a new statement process.

We are hoping to have statements printing as soon as possible which will probably be tomorrow or Friday.

I will give you more information when I have it.

Josh.

Aborted Module Name: AREGDYDL.FTPS_CURL_01

Date: Day: Time: Resolution:

01/18/11 Tue 18:05 Restarted by ITS.

Error log and follow up comments:

#==============================================================================

# FATAL : Command failed with code : 100

#------------------------------------------------------------------------------

# 2011.01.18-18:05:08 : RETURN CODE = 100 : /appworx/csu/exec/FTPS_CURL.PL

#==============================================================================

error is 100

===== Exiting SCRIPT_MS_CSU =====

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

Jerry is going to reset the password on the State Driver's License side.

Can anyone tell me the name of the job that sends Jerry a monthly email reminder to reset the password? He does not remember receiving one.

Once Jerry has reset the password we can continue with this job.

Vicki.

Can you add ro_aries_security@mail.colostate.edu to the distribution list? I think what happened is the email went to the CODoR to reset our password, but they don't know what our password is. Matt, Denise or I may have to follow-up this email with another email or phone call containing our password. Or can Appworx be set-up to send a second email following this one containing just the current password?

I think there should also be one that goes out quarterly to change our password - would you mind checking to make sure ro_aries_security@mail.colostate.edu is included in that one as well? I think I've received this one before so it's probably ok...Jerry.

I added ro_aries_security@mail.colostate.edu to the Email that runs on the 10th of each month. The quarterly Email has this address also, and I also remember receiving this one, so I think it is okay. I deleted AREGDYWL for today per Jerry Becker.

David.

Aborted Module Name: KFSXCS52.KFSXS007_01

Date: Day: Time: Resolution:

01/20/11 Thu 00:26 See note from Janice & Kevin below.

Error log and follow up comments:

Reactivate Employee: Williamson,Mathew Thomas 49167 mtw Student Hourly Employee Employee

Reactivate Employee: Willson,Kendra Dawn 39767 kdsaine Graduate Assistant Employee

**ERROR update prncpl_id=42394 dwilson ORA-00001: unique constraint (KFSUSER.KRIM_PRNCPL_TC1) violated

Reactivate Employee: Wilson,Grace-Lyn Liberato 31600 wilikona Administrative Professional Employee

Reactivate Employee: Wolf-Ringwall,Amber Lee 10868 awolf51 Student Hourly Employee Employee

+------------------------------------------------------------+

| EMPLOYEE INFORMATION JOB SUMMARY |

+------------------------------------------------------------+

+ [ -f login.2453894 ]

+ rm login.2453894

+ print *** \n*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE

+ FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_0

+ 1> 1_20_0026_sql_followup

+ egrep -v -f /ais01/dat/misc/prod/errstrg_sql_ORA_ok

+ 1>> /ais01/dat/work/prod/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_

+ 1>> 01_20_0026_sql_followup

+ egrep -f /ais01/dat/misc/prod/errstrg_sql

+ /appworx/out/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_01_20_0026.A

+ WPROD.LOG print *** \n*** END SEARCH OF LOG FOR SQL ERROR STRINGS

+ \n***

+ 1>> /ais01/dat/work/prod/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_

+ 1>> 01_20_0026_sql_followup

+ cat

+ /ais01/dat/work/prod/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_01_2

+ 0_0026_sql_followup

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

**ERROR update prncpl_id=42394 dwilson ORA-00001: unique constraint (KFSUSER.KRIM_PRNCPL_TC1) violated

***

I will have BFS update the person manually. This chain can be completed.

Kevin.

To proceed with the chain, delete the failed KFSXCS52.KFSXS007_01 component.

Janice.

Aborted Module Name: HRMSS241.HRMSS241_01

Date: Day: Time: Resolution:

01/21/11 Fri 20:32 See follow up below.

Error log and follow up comments:

20:32:35 792 dbms_output.put_line(' ');

20:32:35 793 dbms_output.put_line('ERROR - Carrier ID and/or Coverage code could not be identified for plan '||v_plan||' option '||v_option);

20:32:35 794

20:32:35 808 dbms_output.put_line('ERROR - An unexpected error has occurred.');

20:32:35 809 dbms_output.put_line(sqlerrm);

20:32:35 810 dbms_output.put_line('All changes have been rolled back. Fix problem and run the process again before transmitting files to the vendor

ERROR - Carrier ID and/or Coverage code could not be identified for plan Green

20:32:35 793 dbms_output.put_line('ERROR - Carrier ID and/or Coverage code could not be identified for plan '||v_plan||' option '||v_option);

20:32:35 794 dbms_output.put_line('All changes have been rolled back. Fix problem and run the process again before transmitting files to the vendor.');

20:32:35 795

20:32:35 796 raise;

ERROR at line 1:

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 796

This problem has been taken care of. Do not restart it. When the process runs next week it will pick up any missed transactions.

Stevie G.

So, the plan is to not send any files from this chain until next week then?

Janice.

Delete the module. We will not be sending them files this week.

Stevie G.

If we are not to send them files this week, then we need to delete the chain. If we only delete the module, the remainder of the components in the chain will run, which would send files to the vendor.

Janice.

Aborted Module Name: AREGDYTR.SQLLOAD-LOOP_01

Date: Day: Time: Resolution:

01/24/11 Mon 07:19 See follow up below.

Error log and follow up comments:

value used for ROWS parameter changed from 64 to 9

Record 50: Rejected - Error on table CSUBAN.SWLTNSC, column REQ_SSN.

ORA-12899: value too large for column "CSUBAN"."SWLTNSC"."REQ_SSN" (actual: 21, maximum: 9)

Table CSUBAN.SWLTNSC:

160 Rows successfully loaded.

1 Row not loaded due to data errors.

0 Rows not loaded because all WHEN clauses were failed.

0 Rows not loaded because all fields were null.

Space allocated for bind array: 239166 bytes(9 rows)

Read buffer bytes: 1048576

Total logical records skipped: 0

Total logical records read: 161

Total logical records rejected: 1

Total logical records discarded: 0

Run began on Mon Jan 24 07:19:23 2011

Run ended on Mon Jan 24 07:19:24 2011

Elapsed time was: 00:00:01.56

CPU time was: 00:00:00.04

Please restart AREGDYTR.SQLLOAD-LOOP_01, which is currently in LOADFAIL status. The data file has been corrected by removing special characters. I don’t know if there’s an easy way for IT Scheduling to remember this – but it would be helpful to copy Josh on AREGDYTR chain failure messages since he is one of the “experts” for Transcripts processing and often resolves the problem. He is not in the AGEN AREG alert list though (and probably shouldn’t be), so not sure how that should be handled.

Janice.
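As an illustration of how the SQL*Loader summary above could be checked automatically, the sketch below pulls the rejected-record details out of log text shaped like this entry. The function names are hypothetical and the patterns are written only against the lines quoted here, so other log layouts may need different expressions.

```python
import re


def rejected_count(log_text):
    """Return the 'Total logical records rejected' count, or None if absent."""
    match = re.search(r"Total logical records rejected:\s+(\d+)", log_text)
    return int(match.group(1)) if match else None


def rejected_records(log_text):
    """Return (record_number, column) pairs from 'Record N: Rejected' lines."""
    pattern = r"Record (\d+): Rejected - Error on table \S+, column (\w+)\."
    return [(int(number), column) for number, column in re.findall(pattern, log_text)]


# Lines copied from the log excerpt in this entry.
log_text = """Record 50: Rejected - Error on table CSUBAN.SWLTNSC, column REQ_SSN.
160 Rows successfully loaded.
1 Row not loaded due to data errors.
Total logical records read: 161
Total logical records rejected: 1
"""
print(rejected_count(log_text), rejected_records(log_text))
```

Knowing the record number and column up front points directly at the input line whose special characters need removing.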

Aborted Module Name: EIDSUPDT.HRMSS111_01

Date: Day: Time: Resolution:

01/31/11 Mon 22:45 Restarted by ITS.

02/01/11 Tue 08:23 Restarted by ITS.

02/01/11 Tue 23:01 Restarted by ITS.

Error log and follow up comments:

01/31/11.

829492000 ORA-01422: exact fetch returns more than requested number of rows

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

The following person has duplicate eID email addresses. I just sent an email to Randy Miotke asking if one of the "duplicate" records could be removed.

focklera 1218514

823677933 allison.fockler@rams.colostate.edu;

Vicki.

It appears that other than for the problem person, HRMSS111 was successful, updating 32 records (see HRMSS111 JOB SUMMARY output below). It was in our search of the sql output that the "ORA-" message was detected, thereby forcing the component to fail. Would it be okay to just delete the failed HRMSS111 for today or do you want to rerun it after Randy Miotke deals with the duplicate?

Janice.
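The search Janice describes works in two stages in the job logs: an egrep against a file of error strings, followed by an egrep -v against a file of acceptable ORA- strings. A rough sketch of that logic follows. The pattern lists here are hypothetical, since the contents of errstrg_sql and errstrg_sql_ORA_ok are not shown in these notes.

```python
import re


def sql_followup(log_lines, error_patterns, ok_patterns):
    """Keep lines matching any error pattern unless they match an 'ok' pattern."""
    errors = [re.compile(p) for p in error_patterns]
    allowed = [re.compile(p) for p in ok_patterns]
    flagged = []
    for line in log_lines:
        is_error = any(e.search(line) for e in errors)
        is_ok = any(a.search(line) for a in allowed)
        if is_error and not is_ok:
            flagged.append(line)
    return flagged


log_lines = [
    "32 records updated",
    "829492000 ORA-01422: exact fetch returns more than requested number of rows",
    "ORA-00000: normal, successful completion",
]
# Hypothetical pattern lists standing in for the real pattern files.
flagged = sql_followup(log_lines, ["ORA-"], ["ORA-00000"])
print(flagged)
```

Any line left in the result would force the component to fail, even when the program itself updated its records successfully, which matches the behavior described for HRMSS111.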

The data has been fixed. Please restart HRMSS111.

Vicki.

02/01/11 @ 08:23 and @ 23:01 (with same error as above).

Data has been fixed. Please start EIDSUPDT.HRMSS111_01 again.

Rami.

CSU ID 829312674 Hayley Templeton-Norris

appears to have 2 primary eIDs (hayely, hayleytn), and therefore this person has 2 primary email addresses that we are trying to assign to one employee.

Randy, Can you please take a look at this and let us know what should be done about it? Vicki.

I called Sue Coulson and found that this person’s record state was the result of a merge in Admissions. I’ve demoted the eID “hayely” to a secondary, so the job should now be able to run.

Joe, I need to change the First.Last alias to reflect Hayley’s correct last name. It’s currently Hayley.Templeton-norris@rams.colostate.edu and should reflect the modified (correct) last name: Hayley.Templeton@rams.colostate.edu. You may want to have someone notify Hayley.

Also, if a record does need to be merged, please have the processor check to be sure that two eIDs don’t exist for the person. If they do, I need to do some work on my side so this condition is avoided in the eID data. Let me know if there are questions. Randy.
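The check Randy asks for (making sure a person does not end up with two primary eIDs) could be sketched roughly as below. The record layout of (CSU ID, eID, is_primary) tuples is an assumption made for illustration and is not how the eID data is actually stored; the second person in the sample is made up.

```python
from collections import defaultdict


def multiple_primaries(records):
    """Map each CSU ID to its primary eIDs when there is more than one."""
    primaries = defaultdict(list)
    for csu_id, eid, is_primary in records:
        if is_primary:
            primaries[csu_id].append(eid)
    return {cid: eids for cid, eids in primaries.items() if len(eids) > 1}


records = [
    ("829312674", "hayely", True),     # from the note above
    ("829312674", "hayleytn", True),   # from the note above
    ("800000001", "example1", True),   # hypothetical person with one primary
    ("800000001", "example2", False),
]
print(multiple_primaries(records))
```

A person listed in the result is one for whom a single-row fetch of the primary eID would return more than one row, which is the condition behind the ORA-01422 failures in this entry.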

We can restart HRMSS111 in EIDSUPDT.

Vicki.

Aborted Module Name: KFSXCS52.KFSXS007_01

Date: Day: Time: Resolution:

02/01/11 Tue 00:12 See note from Kevin below.

08/11/11 Thu 03:06 Deleted by Dermot.

Error log and follow up comments:

02/01/11.

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

**ERROR update prncpl_id=4677 swaps ORA-00001: unique constraint (KFSUSER.KRIM_PRNCPL_TC1) violated

*** Error: 29540 ORA-00001: unique constraint (KFSUSER.KRIM_PRNCPL_TC1) violated

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

I will relay the errors (2) on to BFS. This chain can be completed/deleted.

Kevin.

08/11/11.

ORA-20000: ORU-10027: buffer overflow, limit of 1000000 bytes

ORA-06512: at "SYS.DBMS_OUTPUT", line 32

ORA-06512: at "SYS.DBMS_OUTPUT", line 97

ORA-06512: at "SYS.DBMS_OUTPUT", line 112

The buffer overflowed; it appears that a large number of people (12000 +) were set to be inactivated.

I want to make sure that is correct before we restart, because that seems very excessive.

I will be looking at this and will let you know when we have more details.

Josh.

Aborted Module Name: FAIDTSWF_TUITION_SCHOLR_WKFLO

Date: Day: Time: Resolution:

02/03/11 Thu 09:46 See follow up below.

Error log and follow up comments:

“Empty value for prompt not allowed”.

I tried to request the FAIDTSWF chain in for Candy and I got this message (see attached).

Please advise.

David fixed the prompt on this chain and I ran it. (Thanks DavidJ)

I called Candy Chapman to verify that it is OK and she was out. I left a message.

Joleen.

The prompt value was empty. David inserted the value below into the prompt value field and the chain finished successfully.

Dermot.

{#FAID_AID_YEAR}

Aborted Module Name: ADMSSCOR.VPLUS_RCAP-LOOP_01

Date: Day: Time: Resolution:

02/07/11 Mon 06:04 Restarted by ITS.

03/05/13 Tue 06:02 See note from Elden below.

Error log and follow up comments:

02/07/11.

+ 1> /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog_exceptions

+ 1>> /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog

+ /appworx/out/ADMSSCOR.VPLUS_RCAP-LOOP_01.5747767.5747774.00.2011_02_07

+ _0604.AWPROD.LOG rm -ef

+ /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

rm: Removing /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

+ grep FTP_

+ print ADMSSCOR.VPLUS_RCAP-LOOP_01

+ rm -ef /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

Failed because VistaPlus Prod is not up.

Greg/Rich have indicated that we need a DBA to fix the above issue with the databases.

Rich just informed me that VistaPlus Prod is back up – IT Scheduling may restart all the failed VPLUS_RCAP-LOOP components.

Janice.

03/05/13.

+ awexe upd_var_value subvar=#vplus_rcap_iterations_10000457 var_value= flag=Y

ERROR -999 ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 1

ORA-06512: at "APPWORX.AWDYN", line 23

ORA-06512: at line 1

It looks like the /appworx/csu/exec/VPLUS_RCAP-LOOP.KSH script is inconsistently using iteration_count and iteration_cnt – it may be that the upgrade now does not allow an empty var_value?

Elden.

Aborted Module Name: KFSXAPEI.VPLUS_RCAP-LOOP_01

Date: Day: Time: Resolution:

03/05/13 Tue 05:32 See follow up below.

Error log and follow up comments:

+ cat /ais01/spool/vplus/out/KFSXAPEI.10000382.txt

+ 1> /ais01/spool/vplus/out/KFSXAPEI.txt.BKP

+ [[ N = Y ]]

+ awexe upd_var_value subvar=#vplus_rcap_iterations_10000382 var_value= flag=Y

ERROR -999 ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 1

ORA-06512: at "APPWORX.AWDYN", line 23

ORA-06512: at line 1

It looks like the /appworx/csu/exec/VPLUS_RCAP-LOOP.KSH script is inconsistently using iteration_count and iteration_cnt – it may be that the upgrade now does not allow an empty var_value?

The actual error may be due to "#vplus_rcap_iterations_10000382" being longer than allowed – it's 31 characters long including the leading "#". Our chain ID's have just gone over the 10000000 mark. Maybe we can shorten the vplus_rcap_iterations_... to vplus_rcap_iter_ ?

I have a modified version of the script I will test shortly.

Elden.
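Elden's length theory can be checked with a few lines. The sketch below assumes a 30-character limit, which is only suspected in the note above and not confirmed, and uses a seven-digit ID of the kind seen in the 2011 logs for comparison.

```python
LIMIT = 30  # assumed limit; the note only suspects a length restriction


def subvar_name(prefix, chain_id):
    """Build the substitution variable name the way the log shows it."""
    return f"#{prefix}{chain_id}"


for prefix in ("vplus_rcap_iterations_", "vplus_rcap_iter_"):
    for chain_id in (5747774, 10000382):
        name = subvar_name(prefix, chain_id)
        status = "over limit" if len(name) > LIMIT else "ok"
        print(name, len(name), status)
```

Under that assumption the existing prefix fits exactly with a seven-digit ID and overflows by one character with an eight-digit ID, while the shorter prefix leaves several characters of headroom.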

Aborted Module Name: HRMSWKYD.HRMSS166_01

Date: Day: Time: Resolution:

02/07/11 Mon 10:03 Restarted by ITS.

Error log and follow up comments:

10:03:53 44 ,'Yes' login_found_banr

10:03:53 45 ,account_status banr_account_status

10:03:53 46 from jobprd.csu_all_users@banprod

10:03:53 47 where account_status in ('OPEN','EXPIRED')

10:03:53 48 ) banr,

It appears that there was a problem with a database link. Go ahead and restart the job.

Steve H.

I reset HRMSWKYD.HRMSS166_01 and it Aborted with the following error:

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 46:

ORA-04052: error occurred when looking up remote object

ORA-00604: error occurred at recursive SQL level 1

ORA-12519: TNS:no appropriate service handler found

Robin.

HRMSWKYD.HRMSS166_01 is failing with the "ORA-12519: TNS:no appropriate service handler found" error message which Mark mentioned in last night's news file. HRMSS166 runs with login of jobprd@hrprod.

Janice.

Should be fixed now.

Mark B.

Aborted Module Name: AGENWYMD.AGENS021_01

Date: Day: Time: Resolution:

02/08/11 Tue 00:25 Restarted by ITS.

Error log and follow up comments:

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 256

00:05:35 255 begin <<get_address>>

00:05:35 256 select substr(spraddr_street_line1,1,10), substr(spraddr_zip,1,5), spraddr_city

00:05:35 257 into v_address_line, v_zip, v_city

We have a problem in AGENS021 – The Weekly GP Edits. SPRADDR_CITY is bigger than the v_city variable (20 characters). We need to modify the program appropriately and will let you know when to try again.

Vicki.

AGENS021.sql has been updated and copied to temp KEBLER.

Rami.

Please restart AGENWYMD.AGENS021_01.

Vicki.

Aborted Module Name: AROSBURS_FT.SSH_SFTP_01

Date: Day: Time: Resolution:

02/09/11 Wed 05:14 Restarted by ITS.

Error log and follow up comments:

# > ssh: connect to host bfsapp1.acns.colostate.edu port 22: A remote host did not respond within the timeout period.

***

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

Please contact the Bursar's Office (BFS_Bursar@mail.colostate.edu) to determine when their server, bfsapp1.acns.colostate.edu, will be available.

Janice.

It appears that bfsapp1.acns.colostate.edu port 22 is unavailable. Can you please let us know when it is back up so we can restart AROSBURS_FT_TRANSFER_TO_BURSAR, which failed earlier?

Dermot.

The bfsapp1 server crashed last night. We are in the process of repairing it, but it may take a while.

The bfsapp1 server is back up and running. You should be able to run the AROSBURS_FT_TRANSFER_TO_BURSAR process now.

Mike G.

Does IT Scheduling want to add some Chain Abort notes to the AROSBURS_FT_TRANSFER_TO_BURSAR chain documenting the "contact Bursar's office to determine when their server, bfsapp1.acns.colostate.edu, will be available" action to be taken if the SSH_SFTP failure is related to an unresponsive remote host?

Example of error message-

ssh: connect to host bfsapp1.acns.colostate.edu port 22: A remote host did not respond within the timeout period.

Similar Chain Abort notes could also be added to:

HRMSBURS_FT_TRANSFER_TO_BURSAR

KFSXBURS_FT_TRANSFER_TO_BURSAR

Of course, if the SSH_SFTP failure is unrelated to remote host connection problems, then the Bursar's office would not be the first point of contact for troubleshooting.

Thanks,

Janice.
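To help tell an unresponsive remote host apart from other SSH_SFTP failures, a quick reachability test along these lines might be useful before contacting the Bursar's Office. This is a sketch only: the function name and the 10-second default timeout are assumptions, and a successful TCP connection says nothing about whether the SFTP login itself will work.

```python
import socket


def port_reachable(host, port=22, timeout=10.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name lookup failures.
        return False
```

For example, port_reachable("bfsapp1.acns.colostate.edu") returning False would suggest the remote host problem described above, in which case the Bursar's Office is the right first contact.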

AROS Abort notes updated by Dermot, KFSX by Robin & HRMS by James.

Aborted Module Name: HRMSWKSP_01.HRMSS018_01

Date: Day: Time: Resolution:

02/14/11 Mon 22:17 See follow up below.

Error log and follow up comments:

ORA-20006: Error getting session dates ORA-20003: Error getting WORK_STUDY_AY_END_DATE ORA-01403: no data found

ORA-06512: at line 537

22:17:46 535 dbms_output.put_line(sqlerrm);

22:17:46 536 rollback;

22:17:46 537 raise;

22:17:46 538 END;

22:17:46 539

Grantham,Justin Daniel 10767901 26-Feb-2010 201010 MWSA 30.51

ORA-20006: Error getting session dates ORA-20003: Error getting WORK_STUDY_AY_END_DATE ORA-01403: no data found declare

The error occurs on "Kinsell,Heidi M" when it's going to find her fall end date:

select to_date(substr(global_value, 1, 10),'YYYY/MM/DD')

into p_fall_end_date

from ff_globals_f

where global_name = 'WORK_STUDY_FLL_END_DATE'

and p_current_date between effective_start_date and effective_end_date;

The problem is that the p_current_date that is being passed in is '01-APR-2005'.

The only records that exist in ff_globals_f where global_name = 'WORK_STUDY_FLL_END_DATE' have the dates of:

22-DEC-06

21-DEC-07

19-DEC-08

18-DEC-09

17-DEC-10

The question is how to fix this problem. -Bob-
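The failure Bob describes is a date-effective lookup with no row covering the date passed in. A small sketch of the same logic is below. The effective start and end dates are hypothetical (only the fall end dates appear in the notes), and the function name is made up for illustration.

```python
from datetime import date


def fall_end_date(rows, current_date):
    """Return the fall end date whose effective range covers current_date."""
    for start, end, value in rows:
        if start <= current_date <= end:
            return value
    # Analogue of ORA-01403: the SELECT ... INTO found no row.
    raise LookupError(f"no data found for {current_date}")


# (effective_start_date, effective_end_date, fall end date)
# The ranges are hypothetical; the fall end dates come from the list above.
rows = [
    (date(2006, 7, 1), date(2007, 6, 30), date(2006, 12, 22)),
    (date(2007, 7, 1), date(2008, 6, 30), date(2007, 12, 21)),
    (date(2008, 7, 1), date(2009, 6, 30), date(2008, 12, 19)),
]

print(fall_end_date(rows, date(2006, 10, 1)))
try:
    fall_end_date(rows, date(2005, 4, 1))
except LookupError as exc:
    print(exc)
```

With no row effective on 01-APR-2005, the lookup has nothing to return, so the fix has to come either from adding an earlier row or from not passing such an old date into the lookup.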

I just talked with the HR team and the decision is that additional follow-up will need to take place tomorrow for this failure. Therefore, to allow last night's HRMSAW99 to complete, I added the HRMSWKSP chain prefix to the /ais01/dat/work/prod/HRMSAW99.WAIT_FOR_CHAINS_01.DAT "exclusions" file. This will allow HRMSAW99 to complete the next time it wakes up (in just a few minutes) -- thereby creating the notify file which HRMSAW01 requires in order to allow tonight's HRMS nightly schedule to proceed.

HRMSWKSP_01.HRMSS018_01 will remain in ABORTED status until tomorrow's follow-up has occurred…..Janice.

Oh.. I see that HRMSAW00 staging was not delayed so now the HRMSAW99 from yesterday is detecting the HRMS chains which were staged in. IT Scheduling will need to update 02/14 to #HRMSAW99_EXCLUDE_DATE subvar ASAP to allow HRMSAW99 to complete.

Janice.

I have researched this problem and it doesn't seem to make any sense. The error message is ORA-12899: value too large for column "PSP"."PSP_ENC_LINES"."CREATED_BY" but I reviewed the script and there are no updates to the created_by column on the psp_enc_lines table. I also tested the script and it ran successfully without any errors. I reviewed the script notes and it doesn't appear that data has changed since the procedure is simply to run the entire chain again.

We may want to run this chain sometime tomorrow during the day and see if it completes successfully. If it does then tomorrow night's run should be successful. If it fails again then we can take a look at it.

Steve H.

Not sure if we're close on a solution or not - but since it is so late already, IT Scheduling should go ahead and add HRMSWKSP chain prefix to the /ais01/dat/work/prod/HRMSAW99.WAIT_FOR_CHAINS_01.DAT "exclusions" file again like we did yesterday to allow HRMSAW99 to complete. Steve Hill just informed me that he has created a temp HRMSS018.sql to solve the HRMSWKSP problem. Please restart the aborted HRMSWKSP_01.HRMSS018_01 component ASAP.

Janice.

Aborted Module Name: AREGDYAD.WEB_API_01

Date: Day: Time: Resolution:

02/15/11 Tue 02:54 Restarted by Janice.

08/14/13 Wed 07:16 Restarted by Joleen.

Error log and follow up comments:

02/15/11.

# 20110215-025502 : *** FATAL *** | Base_API::_exit < ScriptInterface::_exit : stack

# 20110215-025502 : *** FATAL *** | [00] [/appworx/csu/exec/Base_API.pm:000788] Base_API::call_stack #=1 @=1 < Base_API

# 20110215-025502 : *** FATAL *** | [01] [/appworx/csu/exec/WEB_API.PL:000198] Base_API::_exit #=1 @=0 < ScriptInterface

# 20110215-025502 : *** FATAL *** | [02] [/appworx/csu/exec/WEB_API.PL:001281] ScriptInterface::_exit #=1 @=0 < ScriptInterface

# 20110215-025502 : *** FATAL *** | [03] [/appworx/csu/exec/WEB_API.PL:001590] ScriptInterface::post_file #=1 @=0 < Main

# 20110215-025502 : *** FATAL *** | [04] [/appworx/csu/exec/WEB_API.PL:001461] Main::main #=1 @=0 < Main

# 20110215-025502 : *** FATAL *** | ****************************************************************************************************

# 20110215-025502 : *** FATAL *** | Post Error

# 20110215-025502 : *** FATAL *** | ****************************************************************************************************

Error was:

# 20110216-034811 : [UAResponse] | 500 Can't connect to apps.gradesfirst.com:443 (Bad hostname 'apps.gradesfirst.com')

Changed "apps.gradesfirst.com" to "app.gradesfirst.com" and restarted - completed successfully.

There was a problem with the URL prompt value - I corrected this value and restarted - completed successfully.

A like change has been made to the chain definition.

Janice.

08/14/13.

SSL negotiation failed: at /usr/opt/perl5/lib/site_perl/5.8.8/LWP/Protocol/http.pm line 31

I tried restarting and got the same error.

AREGDYAD is the Daily Athletic Extract. This is sending data to Grades First, I believe.

I would suggest trying the WEB_API again and if it does not work, then let's talk with Elden.

I called GradesFirst this morning at (800) 745-5180. They said that they had changed their SSL certificate, and that is what is preventing us from sending our updated data to them.

They are going to send an email with instructions. We will follow up after we receive that information.

Vicki.

Aborted Module Name: HRMSENCD.HRMSS074_01

Date: Day: Time: Resolution:

02/15/11 Tue 20:47 See note from David below.

Error log and follow up comments:

HRMSENCD.HRMSS074_01 failed with the below error:

20:47:40 82 update psp_enc_lines pel

20:47:40 83 set encumbrance_amount = 0

20:47:40 84 where ENC_LINE_ID = X.ENC_LINE_ID;

20:47:40 85 v_count1 := v_count1 + 1;

20:47:40 86 Else

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE

FOLLOWING:

***

ERROR at line 1:

ORA-12899: value too large for

column "PSP"."PSP_ENC_LINES"."CREATED_BY"

ORA-06512: at line 82

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

I contacted Steve Hill, who instructed me to skip the HRMSENCD chain for tonight. I followed the below instructions from the Chain Notes to skip HRMSENCD:

*****NOTE: If it is determined that the problem will not be fixed - i.e. HRMSENCD_DAILY_ENCUMBRANCES chain will be skipped and follow-up will be done the next day, then proceed as follows:

1) Delete the HRMSENCD_DAILY_ENCUMBRANCES chain from backlog.

2) Delete the HRMSAW14_ENCUMBRANCES_DONE chain from backlog.

This will allow downstream dependent chains (such as the rest of the HR update schedule) to proceed, but skip the HRMSAW14 chain components which would have created notify files for KFSX encumbrance related processing. Consequently, KFSX related encumbrance processes will be skipped.

David.

Aborted Module Name: KFSXSYPG.KFSX_JAVA_01

Date: Day: Time: Resolution:

02/16/11 Wed 19:01 See follow up below.

Error log and follow up comments:

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by

+ MESSAGES TO STD OUTPUT \n***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

Several KFSX jobs have failed with "ERROR: Malta SCRIPT ABORTED - EXIT CODE=1". I contacted Kevin who will login to check the scripts. He also suggested calling Shawn to check out Malta. Shawn is also looking.

Shawn modified the script that runs the JAVA. I have re-started all the failed KFSX JAVA jobs.

David.

Last night's problem with KFSX Java not only caused a stall/delay in the nightly KFSX production schedule, but many of the nightly processes actually did not run due to the automated cancel and proceed feature (CHAIN_CANCEL) within these chains (see below). While this automated feature is designed for the occasional specific failure within these chains, the unfortunate side-effect is that when a global process (such as the kfsa_java script) is malfunctioning then **many** nightly batch processes are in essence never performed. Consequently, due to the potential adverse impact, it is extremely important that any changes to global scripts are adequately tested.

We were successfully running KFS java programs via Applications Manager during the day yesterday, with the last successful execution of a KFS java program being the Clear Cache program at 16:02. The next java program, which ran around 19:00 failed immediately after executing the "unset CATALINA_OPTS" statement (within the /app/env/kfsprd_common.env script?). Between 19:00 and 19:11, nine java programs in nine different KFSX chains failed with the same error. Of those, the five chains listed below (KFSXAPAL, KFSXAPAP, KFSXAPRP, KFSXPDCA, KFSXPDGL) contained the automated cancel - so those processes did **not** run last night - i.e. they were not restarted by David because they were not in backlog. Only the failed java programs within KFSXPDFR, KFSXSYPG, KFSXCGCF, KFSXFPPC were restarted. Do we know what caused the problem? Janice.

I added the “unset CATALINA_OPTS” command to the kfstrng_common.env script yesterday as part of the work I am doing to diagnose KFS/Tomcat stoppage issues. I need that to occur between a startup and a shutdown. Before making the change, I checked the kfsprd_appworx.env script. The “appworx” script executes the “common” script and then sets CATALINA_OPTS. So, the appworx script essentially unsets CATALINA_OPTS before resetting another value.

What I suspect was happening is that the KFS jobs in AppWorx are set up to not only run the appworx script, but then follow by running the common script. Would you verify whether this is the case or not? If it is, it should not be necessary for the jobs to explicitly execute the common script as the appworx script already handles that. If it isn’t, we will need to look further. Shawn.

The Applications Manager KFSX_JAVA script has not changed since 4/20/2010 – it performs an SSH to the host machine (in this case, Malta2) and specifies the host command (in this case, the kfsx_java_ssh.ksh script) to run on the host machine. Additional parameters are passed into the kfsx_java_ssh script, such as the java program name to be executed, the “prd” instance qualifier, etc. Again, the kfsx_java_ssh.ksh script has not changed since 04/20/2010. Within the kfsx_java_ssh.ksh script, the following command is executed:

. /app/env/kfs${batch_service_env}_appworx.env (in this case, ${batch_service_env} would evaluate to prd)

I don’t have a login to Malta (or if I do, I don’t remember the password) – but based on the echoing from our joblogs I’m guessing that it is within the kfsprd_appworx.env script (which I believe is maintained by you/DBA’s) that the “. /app/env/kfsprd_common.env” statement would be located.

<</ais02/job/prod/kshexe_ssh.74>> kfsx_java_ssh.ksh prd RunBatch pcardNotificati

<#/ais02/job/temp/kfsx_java_ssh.ksh.24#> hostname Malta

<#/ais02/job/temp/kfsx_java_ssh.ksh.31#> batch_service_env=prd

<#/ais02/job/temp/kfsx_java_ssh.ksh.32#> batch_service=RunBatch

<#/ais02/job/temp/kfsx_java_ssh.ksh.33#> [[ RunBatch = RunBatch ]]

<#/ais02/job/temp/kfsx_java_ssh.ksh.35#> (( 4 < 4 ))

<#/ais02/job/temp/kfsx_java_ssh.ksh.40#> export RunBatch_stepName=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00

<#/ais02/job/temp/kfsx_java_ssh.ksh.41#> export RunBatch_jobName=KFSXFPPC.pcardNotificationStep.5795021.5795032.00

<#/ais02/job/temp/kfsx_java_ssh.ksh.42#> batch_service_parms=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_appworx.env -> This is in kfsx_java_ssh.ksh

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_common.env -> This is NOT in kfsx_java_ssh.ksh, so must be within kfsprd_appworx.env

The statement immediately following the “. /app/env/kfsprd_common.env” statement in the kfsx_java_ssh.ksh script is:

BATCH_DIR=${CATALINA_HOME}/webapps/kfs-${batch_service_env}/WEB-INF/classes

Which we actually never got to due to the failure on the “unset CATALINA_OPTS” command – which has been invoked somewhere directly (or indirectly) via the “. /app/env/kfsprd_common.env” statement.

I’m still confused – if you only changed kfstrng_common.env script, then how did that change cause the kfsprd scripts to fail?...Janice.
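One way to narrow down which environment script actually contains the statement is to search the env directory for it directly. The following is only a minimal sketch, assuming read access to the directory on the host; the function name `find_env_statement` is hypothetical and is not part of kfsx_java_ssh.ksh or any existing script.

```shell
# Hypothetical helper (not an existing script): list every *.env file in a
# directory that contains a given literal statement, with line numbers.
find_env_statement() {
    env_dir=$1      # e.g. /app/env (path taken from the joblog above)
    statement=$2    # e.g. "unset CATALINA_OPTS"
    # -F treats the statement as a fixed string, -n prints line numbers.
    # Adding /dev/null makes grep print the file name even for a single match.
    grep -Fn -- "$statement" "$env_dir"/*.env /dev/null
}
```

Called as `find_env_statement /app/env "unset CATALINA_OPTS"`, it would show whether the statement exists only in the kfstrng script or in the kfsprd scripts as well.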

Aborted Module Name: KFSXPDFR.KFSX_JAVA_02

Date: Day: Time: Resolution:

02/16/11 Wed 19:06 See follow up below.

Error log and follow up comments:

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by

+ MESSAGES TO STD OUTPUT \n***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

Shawn modified the script that runs the JAVA. I have restarted all the failed KFSX JAVA jobs.

David.

Aborted Module Name: KFSXFPPC.KFSX_JAVA_03

Date: Day: Time: Resolution:

02/16/11 Wed 19:01 See follow up below.

Error log and follow up comments:

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by

+ MESSAGES TO STD OUTPUT \n***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

Shawn modified the script that runs the JAVA. I have restarted all the failed KFSX JAVA jobs.

David.

Aborted Module Name: KFSXCGCF.KFSX_JAVA_01

Date: Day: Time: Resolution:

02/16/11 Wed 19:01 See follow up below.

Error log and follow up comments:

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by

+ MESSAGES TO STD OUTPUT \n***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

Shawn modified the script that runs the JAVA. I have restarted all the failed KFSX JAVA jobs.

David.

Aborted Module Name: AGENDYGN.AGENS006_01

Date: Day: Time: Resolution:

02/16/11 Wed 20:01 Restarted by ITS.

Error log and follow up comments:

20:01:44 1781

20:01:44 1782 gb_common.p_commit();

20:01:44 1783 utl_file.fclose(file_handle);

20:01:44 1784 exception

20:01:44 1785 when others then

20:01:44 1786 -- flush output needed for troubleshooting

20:01:44 1787 put_report_line('Error: ' || sqlerrm);

20:01:44 1788 utl_file.fflush(file_handle);

20:01:44 1789 utl_file.fclose(file_handle);

20:01:44 1790 raise; -- reraise the exception

20:01:44 1791 end;

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-20100: ::Hold from date must be less than or equal to hold to date::

ORA-06512: at line 1790

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

This person has an MA address with an end date of 01-JAN-2099

11284686 829338399 Rodriguez Marcia Stephanie

Beginning 09-FEB-11 and ending 01-JAN-99

AGENS006 is trying to add a hold and failing. Can you please check with Jeanie Breiner (who last touched this record on 16-FEB-11, yesterday) and see if we can remove the end date that is out 88 years or if perhaps the end date should be something else? When the address record has been fixed, we can restart AGENDYGN.AGENS006_01

Vicki.

Just got back from a meeting. Her address has been corrected. Please restart the job below.

Sue Coulson.

Thanks Sue for getting this person fixed. Please restart AGENDYGN.AGENS006_01 that aborted.

Vicki.

Aborted Module Name: HRMSS231.FTP_TO_SELMAN_01

Date: Day: Time: Resolution:

02/18/11 Fri 21:22 See note from Janice below.

Error log and follow up comments:

# FATAL : Unexpected error - undefined line returned

#------------------------------------------------------------------------------

# USAGE: /appworx/csu/exec/FTP_ENHANCED.PL \

# remote_host=override_host_name\

# transfer_mode=transfer_command\

# translate=translation_mode\

# src_file=fully_qualified_source_file\

# dst_file=fully_qualified_destination_file\

# site_options=comma_delimited_site_options\

# local_options=semicolon_delimited_local_options\

# transfer_mode values

# append, dir, get, put, recv, send, submit

# translate values:

# ascii, binary, ebcdic

# site_options values

# comma delimited site options

# RECFM=FBA,LRECL=133,BLKSIZE=3325

# local_options values

# semicolon delimited local options

# active | passive | cd=remote_dir

# Also, these environment variable must be set

# net_connect, db_login, db_password

#------------------------------------------------------------------------------

# exit : [ 2011.02.18-21:23:02 ] -- RETURN CODE = 100

I worked with Elden on this - seems that vendor requires SFTP - hence the FTP failure. We deleted this chain, deactivated the FTP_TO_SELMAN component and reactivated the SSH_SFTP component in the HRMSS231_HARTFORD_PORT chain and then requested the chain in to run again. SSH_SFTP successfully transferred the encrypted file to the vendor and the chain has successfully completed.

David will need to follow up with permanent removal of the FTP_TO_SELMAN component and associated module/login/etc.

Janice.

Aborted Module Name: OSYSJOBS_04.OSYSLLNK_01

Date: Day: Time: Resolution:

02/20/11 Sun 16:33 Restarted by ITS.

Error log and follow up comments:

*** PROCEED WITH EXECUTION OF SCRIPT: sys_llnk_rsh.ksh

***

<</ais02/job/prod/kshexe_rsh.70>> sys_llnk_rsh.ksh

<#/ais02/job/prod/sys_llnk_rsh.ksh.23#> alias log=echo "*** " $(date +%m/%d/%Y-%T)

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_09.OSYSPURG_01.5814124.5814128.00_opmn_logs is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.21#> [[ 1 > 0 ]]

<#errtrap_rsh.21#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1

I'm forwarding this email which I sent back in November regarding OSYSLLNK failures. IT Scheduling should review this email and proceed with troubleshooting today's OSYSJOBS_04.OSYSLLNK_01 as was outlined in this old email. Thanks, Janice

Subject: FW: APPWORX ABORT - OSYS

Janice.

Aborted Module Name: FAIDINST_NW.LYNX_02

Date: Day: Time: Resolution:

02/22/11 Tue 21:07 See follow up below.

Error log and follow up comments:

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

+ print *** \n*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog_exceptions

+ 1>> /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog /appworx/out/FAIDINST_NW.LYNX_02.5823654.5823658.00.2011_02_22_2107.AWPROD.LOG

+ rm -ef /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

rm: Removing /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

+ grep FTP_

+ print FAIDINST_NW.LYNX_02

+ rm -ef /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

Tom called. He would like to restart this from the beginning.

Dawn.

Tom called. He would like to put a hold on his request below. He would like to talk to one of his developers first, then he will get back to us.

Joleen.

Let's go ahead and get ready for Tom's request by deleting the FAIDINST_SCHOLARSHIPS chain which is currently in backlog, then use the schedule procedure to schedule the FAIDINST_NW schedule with a start time in the future, but before 10:00 A.M. Then run staging to bring FAIDINST_NW into backlog and be sure to place the chain on hold right away. This will allow us to get the chain scheduled on the previous virtual day, yet hold it until Tom is ready with his code changes.

When FAIDINST_SCHOLARSHIPS was staged in, the value provided for Prompt #4 (Hours ahead to be staged) was large enough that it also staged in tonight's FAIDINST_SCHOLARSHIPS (with 17:00 start time). In the future, it would be best to provide the smallest value possible for "Hours ahead to be staged" to avoid staging in more than intended/necessary. In the case with FAIDINST, last night's FAIDAW99 was detecting tonight's FAIDINST in backlog and therefore would not complete. I updated #FAIDAW99_EXCLUDE_DATE with a value of 02/23 so that FAIDAW99 could complete.

Janice.

Aborted Module Name: FAIDINST_NW.LYNX_01

Date: Day: Time: Resolution:

02/23/11 Wed 21:06 See follow up below.

Error log and follow up comments:

URL=http://wsprod.colostate.edu/cwis231/onet/autorun/partnership_schols.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

Looks like we may have a repeat of the same problem we had yesterday - any suggestions?

Tom has requested that we run FAIDINST_SCHOLARSHIPS "from the top" again like we did yesterday.

Therefore, IT Scheduling will need to:

1) Delete the FAIDINST_SCHOLARSHIPS chain which is currently in backlog

2) Using the schedule procedure, schedule the FAIDINST_NW schedule to run today with a start time in the future, but before 10:00 A.M.

3) Run staging (with a value of 1 for the "Hours ahead to be staged" prompt) to bring FAIDINST_NW into backlog. Once the "new" FAIDINST_SCHOLARSHIPS chain for FAIDINST_NW schedule has been staged into backlog, it may be released to run right away.

Janice.

Aborted Module Name: AGENWYWP.AGENS004_01

Date: Day: Time: Resolution:

02/23/11 Wed 18:03 Restarted by ITS.

Error log and follow up comments:

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character to number conversion error

ORA-06512: at line 633

18:03:07 629 --delete from csug_purge_ids where marked_flag = 'Y';

18:03:07 630 utl_file.fclose(file_handle);

18:03:07 631 utl_file.fclose(file_error);

18:03:07 632 utl_file.fclose(file_purge);

18:03:07 633 raise;

18:03:07 634 --

Where is the UTL file output that would tell us who was the last person processed?

Vicki.

CORRECT ERROR:

-29422721 11296725 -Pirge Fulton-Beale Davie Nathaniel28-JAN-11

-29436203 11298660 Purge-KROPUENSKE LEAH ROSE 19-JAN-11

-29496592 11307296 Dubosson Anne Sophie 03-FEB-11

Persons not added to purge file: 3

Please restart this chain at AGENS004. I needed to see the UTL file output which Janice helped me find in /orautl/BANPROD/chain name...It identified the person in error, Sue Coulson fixed the problem and we should be able to restart.

Vicki.

Aborted Module Name: ODBABKUP_BANDORA_DB

Date: Day: Time: Resolution:

02/24/11 Thu 16:30 Deleted by ITS.

Error log and follow up comments:

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog /appworx/out/ODBABKUP_BANDORA_DB.5835293.5835293.00.2011_02_24_1630.AWPROD.LOG

+ rm -ef /ais01/dat/work/prod/ODBABKUP_BANDORA_DB_jobstat

rm: Removing /ais01/dat/work/prod/ODBABKUP_BANDORA_DB_jobstat

+ grep FTP_

+ print ODBABKUP_BANDORA_DB

+ rm -ef /ais01/dat/work/prod/ODBABKUP_BANDORA_DB_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

Please cancel that chain, looks like KFSTRNG is still not quite right. I will fix it myself today and manually run the rest of the backups.

Mark B.

And just so everyone knows - this was Shawn's fault.

Shawn.

It's easy to forget that notify file...it's all good.

Mark B.

Aborted Module Name: FAIDTRAK_EV.LYNX-01

Date: Day: Time: Resolution:

02/25/11 Fri 04:15 Restarted by David.

Error log and follow up comments:

URL=http://wsprod.colostate.edu/cwis231/autorun/spring_only_no_spring_aprd.cfm?ay=FAIDTRAK_EV (GET)

STATUS=HTTP/1.1 503 Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

I decided to re-start this and it finished okay. Unfortunately, the schedule is way behind.

Does the STATUS=HTTP/1.1 503 Server Error mean it had trouble connecting to the web site?

David.
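On David's question: a 503 status line means the web server was reached and answered, but reported that it could not service the request at that time. A failure to connect at all would produce no HTTP status line from the server. The joblogs in this document show two exit codes from the LYNX check, 100 ("Status not OK") and 101 ("File Empty"). The sketch below shows how such a check could be written; it is a hypothetical re-creation for illustration, not the actual LYNX module script, and the function name and file arguments are assumptions.

```shell
# Hypothetical re-creation of the status check; only the exit codes 100 and 101
# and their messages are taken from the joblogs in this document.
check_lynx_result() {
    status_file=$1   # file holding a line such as "STATUS=HTTP/1.1 200 OK"
    output_file=$2   # file holding the page body that was retrieved
    # Pull the three-digit HTTP status code out of the STATUS= line.
    code=$(sed -n 's/^STATUS=HTTP\/[0-9.]* \([0-9][0-9][0-9]\).*/\1/p' "$status_file" | head -n 1)
    if [ "$code" != "200" ]; then
        echo "[100] : *** ERROR Detected in Output : Status not OK ***"
        return 100
    fi
    # A 200 response with an empty body is still treated as a failure.
    if [ ! -s "$output_file" ]; then
        echo "[101] : *** ERROR Detected in Output : File Empty ***"
        return 101
    fi
    return 0
}
```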

Aborted Module Name: AGENORGN.CHAIN_VPLUS_PS_01

Date: Day: Time: Resolution:

02/28/11 Mon 16:02 See follow up from Janice below.

Error log and follow up comments:

+ The file access permissions do not allow the specified action.

***

*** END SEARCH OF LAST JOBID(5850710.00) AFTER CONDITIONS RUNHOSTLOG FOR ERROR STRINGS

***

I'm a bit confused/concerned regarding the course of events with the AGENORGN.CHAIN_VPLUS_PS_01 chain component abort. Although I don't see an email reporting it, this component aborted at 16:02 with the same error as listed below for the failure at 16:18. However, it appears that the 16:02 aborted AGENORGN.CHAIN_VPLUS_PS_01 was restarted by IT Scheduling and it completed at 16:03. The problem is that appropriate follow-up was NOT taken for this aborted component. As the error message indicates, it failed due to errors in the AFTER conditions of the previous jobid (i.e. the AGENS002 component). Review of the associated runhost log file for this chainid, /ais01/joblog/runhost_5850540_AWPROD.log, reveals that the error occurred when the AGENORGN.AGENS002_01 AFTER condition was trying to empty the user's data file but the group permissions on the file did not allow "write" access. The chain component includes this condition to empty the input data file to ensure that the same user data file is not processed the next time the chain runs - this is an extremely important action to avoid duplicate processing of data. Appropriate follow-up **must** always be taken to determine why an AFTER condition encountered an error and rectify the problem. In this case, appropriate follow-up should have included changing the unix permissions to allow group write access and then manually emptying the file - the easiest way to manually accomplish this is to delete the file and then recreate it as empty.

The chain was run again at 16:16 - apparently with the same user input data file. Of course, the AGENORGN.CHAIN_VPLUS_PS_01 failed again with the same problem due to the AFTER condition's failure to be able to empty the user's data file. Again the chain component was restarted - however, this time it appears that David manually performed the necessary follow-up to modify unix permissions and empty the /userfiles/Uareg/data/AGENORGN.AGENS002_01.DAT file.

Comments/questions?

Janice.
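The manual follow-up described above (allow group write access, then empty the file) can be sketched as a small shell function. This is only an illustration with a hypothetical function name. It truncates the file in place rather than deleting and recreating it, which is a different technique from the one suggested above and keeps the file's owner and mode. It assumes the caller owns the file (needed for chmod) and can write to it.

```shell
# Hypothetical helper: grant group write access and empty the data file so the
# same user data is not processed again on the next run of the chain.
empty_data_file() {
    data_file=$1
    chmod g+w "$data_file" || return 1
    # ':' is the shell no-op; the redirection truncates the file to zero bytes.
    : > "$data_file"
}
```

For the file named in this entry, the call would be `empty_data_file /userfiles/Uareg/data/AGENORGN.AGENS002_01.DAT`.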

Aborted Module Name: HRMSMGMT.HRMSR053_01

Date: Day: Time: Resolution:

03/02/11 Wed 06:30 See follow up from Janice below.

03/04/11 Fri 22:16 See follow up below.

Error log and follow up comments:

03/02/11.

Craig called last night when he "killed" the long running HRMSR053. After that, it was acting weird in Applications Manager in that the chain component status was in KILLING status - I thought after a while it might change to KILLED... but not yet! I removed predecessors on the VPLUS_RCAPTURE so that we could go ahead and get the HRMSR001/1SS reports captured. Then I changed the HRMSMGMT chain so it was not single run and requested another one in... deleted the HRMSR001/001SS reports which have already run so we could try HRMSR053 again - it's been running now for a little over an hour. The daily version of HRMSR053 ran okay last night... so at least with an early morning start of HRMSR053, we'll maybe know during working hours if it will finish in 7 hours like it historically did before the upgrade. Seems odd that the daily version is running okay, but the monthly was doing who knows what for 25+ hours and still didn't finish. I think we may have to have Rich/Greg delete the HRMSR053 jobid that is stuck in "KILLING" status from the so_job_queue table in AWPROD.

Greg/Rich,

Please delete jobid 5851789 (HRMSMGMT.HRMSR053_01) from so_job_queue table - this chain component is stuck in KILLING status. I've tried deleting the entire chain, but nothing happens. No "operations" are available for 5851789 - i.e. cannot directly DELETE, KILL, RESET, etc for this chain component.

Janice.

03/04/11.

REP-0110: Unable to open file 'HRMSR053.rdf'.

REP-1070: Error while opening or saving a document.

REP-0110: Unable to open file '/app/oracle/apps/12/hrprodappl/csuh/12.0.0/reports/US/HRMSR053

The "daily" version of HRMSR053 has completed:

Started:2011-03-07 08:26:41.0,Finished:2011-03-07 13:03:05.0, Elapsed: 04:36:24

At least for the "daily" version there was no improvement in runtime, with previous post-upgrade executions of the "daily" version completing with similar, but slightly shorter, elapsed times:

3/3/11 - 4:04:50

3/2/11 - 3:57:24

2/28/11 - 4:16:48

We were however running the "daily" version during the day so perhaps the increased runtime may have something to do with hrprod being busier due to online activity/competition.

The "monthly" version of HRMSR053 has been running now for a little over 5 hours. One thing we learned last week was that the "monthly" version and "daily" version do not walk the same logic, so hopefully we will see improvement in runtime for the "monthly" version.

Janice.
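The elapsed-time comparison above can be reproduced with a little shell arithmetic. This is a minimal sketch with hypothetical function names, assuming both clock times fall on the same day.

```shell
# Convert HH:MM:SS to seconds. awk reads "08" as decimal 8, which avoids the
# octal interpretation that shell arithmetic can apply to leading zeros.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Elapsed time between two same-day clock times, printed as HH:MM:SS.
elapsed() {
    start=$(to_seconds "$1")
    finish=$(to_seconds "$2")
    diff=$((finish - start))
    printf '%02d:%02d:%02d\n' $((diff / 3600)) $((diff % 3600 / 60)) $((diff % 60))
}
```

`elapsed 08:26:41 13:03:05` prints `04:36:24`, matching the elapsed time reported for the 3/7/11 run. The difference from the 3/3/11 run (4:04:50) works out to 1894 seconds, a little under 32 minutes.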

I am following this program in Enterprise Manager and it is not getting stuck in the SQL statement that it was last week. It is still running that code, but it is doing it much faster. So, it appears to be working much better in that regard. Time will tell.

Craig P.

Aborted Module Name: HRMSSUMC_02.HRMSR053_01

Date: Day: Time: Resolution:

03/04/11 Fri 22:31 See follow up below.

Error log and follow up comments:

REP-0110: Unable to open file 'HRMSR053.rdf'.

REP-1070: Error while opening or saving a document.

REP-0110: Unable to open file '/app/oracle/apps/12/hrprodappl/csuh/12.0.0/reports/US/HRMSR053.rdf'.

Program exited with status 1

Concurrent Manager encountered an error while running Oracle*Report for your concurrent request 6641407.

This has been fixed.

-Bob-

Aborted Module Name: ADMSSCOR.LYNX_01

Date: Day: Time: Resolution:

03/11/11 Fri 06:04 Restarted by ITS.

Error log and follow up comments:

Looking up wsnet.colostate.edu

Making HTTPS connection to wsnet.colostate.edu

Alert!: Unable to connect to remote host.

lynx: Can't access startfile https://wsnet.colostate.edu/ai/tools/RecruitmentPlus/TestScores.aspx

***

*** /appworx/out/ADMSSCOR.LYNX_01.status.txt ***

URL=https://wsnet.colostate.edu/ai/tools/RecruitmentPlus/TestScores.aspx (GET)

STATUS=HTTP/1.1 200 OK

***

[101] : *** ERROR Detected in Output : File Empty ***

+ err=101

The web services connectivity has been resolved.

Please restart ADMSSCOR.

Phil.

Phil was just asking me if the ADMSSCOR ran okay after it was restarted earlier this morning. I think the resolution message(s) need to be sent to all of the recipients who received the original “follow-up/troubleshooting required” email(s). IS Developers (such as Phil/Rami/Bev) who are responsible for production follow-up are not necessarily “watching” Applications Manager, so including them on the “issue resolved” emails will be helpful.

Janice.

Aborted Module Name: OSYSJOBS_06.OSYSPURG_01

Date: Day: Time: Resolution:

03/18/11 Fri 16:32 Restarted by ITS.

Error log and follow up comments:

<#/ais02/job/temp/sys_purg_rsh.ksh.982#> find /nor_orautl/kfs4test -type f -mtime +7 -print

<#/ais02/job/temp/sys_purg_rsh.ksh.982#> 1>> /ais02/dat/work/prod/OSYSJOBS_06.OSYSPURG_01.5945291.5945294.00_too_old

find: 0652-019 The status on /nor_orautl/kfs4test is not valid.

<#/ais02/job/temp/sys_purg_rsh.ksh.982#> errtrap_rsh /ais02/job/temp/sys_purg_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.87#> [[ 1 > 0 ]]

<#errtrap_rsh.87#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Empire SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Empire SCRIPT ABORTED - EXIT CODE=1

Joleen discovered that Janice’s e-mail on November 1st instructed us to restart these if it said, “Status is not valid”. So, I restarted these.

Dawn.

I think this one might be caused by a bad link on Empire – the /orautl/kfs4test link points to /nor_orautl/kfs4test:

# ls -l | grep kfs4

lrwxrwxrwx 1 oracle dba 20 Mar 18 14:36 kfs4test@ -> /nor_orautl/kfs4test

But, /nor_orautl/kfs4test doesn’t exist:

# ls -l /nor_orautl/kfs4test

ls: 0653-341 The file /nor_orautl/kfs4test does not exist.

Rich/Greg,

Can you check this out – and delete link and/or work with dba(s) to determine if it should be there?

Janice.
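The manual check above (a link whose target does not exist) can be generalized to scan a whole directory tree for dangling links. A minimal sketch, with a hypothetical function name; it relies only on standard `find` and `test` behavior.

```shell
# Hypothetical helper: print symbolic links under a directory whose targets
# do not exist.
list_dangling_links() {
    # -type l selects symbolic links; 'test -e' follows the link and fails
    # when the target is missing, so only broken links are printed.
    find "$1" -type l ! -exec test -e {} \; -print
}
```

Running `list_dangling_links /orautl` on Empire before the rename would be expected to list the kfs4test link shown above.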

Shawn renamed kfs4test to kfstest4 on Norrie’s orautl.

So the link on Empire should be ok now.

Not sure if this needs to be restarted now that the problem is fixed.

Rich.

By the way, I didn’t notice this at first… but eventually the OSYSPURG was failing because we were trying to run two OSYSPURG components at the same time for the same host. This occurred because we had two failed OSYSJOBS chains for the same schedule (OSYSJOBS_06) “backlogged” – and restarting them simultaneously resulted in stepping on each other’s toes. In the future, one of the duplicate “backlogged” chains should just be deleted (usually the oldest one), and then the failed component in the remaining chain for the schedule can be restarted, after troubleshooting. This is the course of action which I took for the last OSYSJOBS_06.OSYSPURG_01 failures this morning.

NOTE: This method is only applicable for multiple occurrences of the **SAME** schedule, not failed OSYSJOBS chains for different schedules.

Janice.

Aborted Module Name: AREGDYTR.CONVERT_PDFTOPS_01

Date: Day: Time: Resolution:

03/29/11 Tue 07:20 Restarted by ITS per Josh’s instructions.

Error log and follow up comments:

Module AREGDYTR.CONVERT_PDFTOPS_01 has aborted. No output file.

The before Condition Details indicate it is checking for a file that does not exist:

{#spool_out}/AREGDYTR.AREGR600.{chain_id}.PDF

Here is how we need to proceed:

Go ahead and skip (delete) the CONVERT_PDFTOPS and SPOOL_FILTER_01 modules.

Then let the rest of the chain run. (AREGS604 on)

Josh.

Aborted Module Name: AGENDYHB.SRRSRIN_01

Date: Day: Time: Resolution:

03/29/11 Tue 19:04 Restarted by ITS.

Error log and follow up comments:

Username:

Password: Connected.

VOID TIMER(int timemode: 1)

RUN SEQUENCE NUMBER:

Parameter 01 HRMS Read from Job Submission

Parameter 03 A Read from Job Submission

Parameter 04 N Read from Job Submission

Parameter 02 was not found in Job Submission

Parameter 99 55 Read from Job Submission

Parameter 05 was not found in Job Submission

ORA-08176: consistent read failure; rollback data not available

ORA-06512: at "BANINST1.GOKCMPK", line 914

ORA-06512: at "BANINST1.GOKCMPK", line 2327

ORA-06512: at "BANINST1.GOKCMPK", line 2717

ORA-06512: at line 1

WRN-ORACERR: Error occurred in file "srrsrin.pc" at line 2,798

WRN-ERRSTMT: Following statement was last statement parsed:

declare birth_yr varchar2 ( 4 ) := '' ; BEGIN if ( ( :birth_year is nu

srrsrin terminated with error

22 lines written to /appworx/out/AGENDYHB.SRRSRIN_01.5999779.5999791.00.2242947.lis

I would like to try and just restart this step and see what happens.

Vicki.

Aborted Module Name: KFSXAPEI.KFSX_JAVA_01

Date: Day: Time: Resolution:

03/30/11 Wed 06:03 See notes below.

Error log and follow up comments:

Error Message:

2011-03-30 06:14:15,619 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXAPEI.electronicInvoiceExtractStep.6003882.6003885.00 steps: [electronicInvoiceExtractStep]

2011-03-30 06:14:15,619 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.OutOfMemoryError

RunBatch ERROR: Exception found:

java.lang.OutOfMemoryError

Is it possible that we need to override the catalina memory option for this java program, increasing it from the default 1g to 2g?

I'm assuming that it will be okay to just restart this component - or is there any data cleanup necessary? If okay to restart, then IT Scheduling may proceed as follows:

1) In **BACKLOG**, provide the following value for prompt 5 "Catalina Opts Memory Override - blank to use default value of 1g, 2g for 2gb, etc." on the aborted KFSXAPEI.KFSX_JAVA_01 component: 2g

2) Restart the aborted KFSXAPEI.KFSX_JAVA_01

The 2g value will also need to be provided as a permanent change to the KFSXAPEI.KFSX_JAVA_01 component in the chain definition.

Janice.
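The prompt's "blank means default" behavior can be sketched as follows. How the real script turns the prompt value into a CATALINA_OPTS setting is not shown in this log, so the mapping to the JVM's `-Xmx` maximum heap option is an assumption, and the function name is hypothetical.

```shell
# Hypothetical mapping from the prompt 5 value to a JVM heap option.
# A blank or missing value falls back to the default of 1g, as the prompt
# text describes; using -Xmx for the override is an assumption.
catalina_mem_opt() {
    mem=${1:-1g}
    echo "-Xmx${mem}"
}
```

With the override from step 1, `catalina_mem_opt 2g` prints `-Xmx2g`; with a blank prompt it prints `-Xmx1g`.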

I had to manually remove the list of *.processed files in the Staging directory. Please restart the failed component.

John.

Did you have to remove the *.processed files because there was no commit prior to the java program failure? In other words, all the xml files which had processed prior to the failure were not committed and therefore removal of their associated .processed file(s) was necessary to ensure that these xml files would be re-processed upon restarting the aborted java program?

Janice.

Correct. We can sign into KFSPRD and check to see if PREQs (payment requests) are created by user KFS which is a system user name. I also checked for EIRT documents (Electronic Invoice Reject Document). So the eInvoice xml files are loaded and create either the PREQ or EIRT. There were no documents with Initiator of KFS, so I knew that the job rolled back the transactions, even though we had *.processed on almost all *.xml files.

John.
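The cleanup John describes (removing the *.processed files so the XML files are picked up again on restart) could be scripted along these lines. This is a sketch only: the function name is hypothetical, the staging directory path is not given in this log, and it assumes the markers are plain files matching *.processed in that directory.

```shell
# Hypothetical helper: remove *.processed marker files from a staging
# directory and print each one removed. The XML files themselves are untouched.
clear_processed_markers() {
    staging_dir=$1
    for marker in "$staging_dir"/*.processed; do
        [ -e "$marker" ] || continue   # the glob matched nothing
        rm -f -- "$marker" && echo "removed $marker"
    done
}
```

As John's check shows, this should only be done after confirming that the failed run rolled back, so that no invoice is loaded twice.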

Aborted Module Name: AREGHRTM_SM.AREGS415_01

Date: Day: Time: Resolution:

04/04/11 Mon 09:02 See follow up below.

Error log and follow up comments:

ERROR at line 1:

ORA-29280: invalid directory path

ORA-06512: at "SYS.UTL_FILE", line 41

ORA-06512: at "SYS.UTL_FILE", line 478

ORA-06512: at line 94

Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options

This has been resolved. I made a slight change to the jobprd@banprod login because I was trying to get 'CHECK CONNECTION' feature to work for AREGRTWL chain condition logic - but it ended up causing other scripts to fail :) I switched it back and am throwing in the towel on using 'CHECK CONNECTION' feature for AREGRTWL. For more info, see the IS NEWS email reply which I'm about to send....

Regarding the AREGRTWL_SM/SP DB ERRORS that Dawn reported:

These were related to the Banner shutdown. Since this chain uses subvars with underlying sql against BANPROD as Chain prompts, we really need to check for the /ais01/dat/misc/prod/BANPROD_shutdown_for_maint in CHAIN **BEFORE** conditions. I'm actually surprised that we didn't get a Launch error, instead of the DBERROR. At any rate, I've added the following **CHAIN** BEFORE condition to the AREGRTWL_SLEEP_WAKE_PROCESS chain:

Check for the /ais01/dat/misc/prod/BANPROD_shutdown_for_maint - if exists, CANCEL CHAIN.
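In shell terms, the logic of this BEFORE condition amounts to a file-existence test. The sketch below is a hypothetical stand-in for illustration; the real condition is defined in Applications Manager, not in a script, and the function name is an assumption.

```shell
# Hypothetical stand-in for the CHAIN BEFORE condition: return 0 when the chain
# may run, 1 when the maintenance flag file exists and the chain should cancel.
chain_may_run() {
    flag_file=${1:-/ais01/dat/misc/prod/BANPROD_shutdown_for_maint}
    if [ -e "$flag_file" ]; then
        echo "CANCEL CHAIN: $flag_file exists"
        return 1
    fi
    return 0
}
```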

I tried to also add a 'CHECK CONNECTION' CHAIN BEFORE condition to verify the jobprd@banprod oracle connection but couldn't get the 'CHECK CONNECTION' feature to work properly.

From some testing which I've done this morning, it appears that this CHAIN BEFORE condition will execute before an attempt is made to evaluate subvars associated with chain prompts.

Hopefully this condition will prevent a reoccurrence of the situation which occurred Sunday morning. Keep in mind that the AREGRTWL_SLEEP_WAKE_PROCESS chain runs every 30 minutes, 24-7. Therefore, unless the chain is failing every 30 minutes, it would be okay to delay Sunday follow-up for similar situations until the next working day because subsequent iterations would have successfully run (every 30 minutes). Next working day follow-up would consist of deleting the chain - no need to restart it since many iterations (every 30 minutes) would have already run. The same would be true for other frequency scheduled chains, such as AREGHRTM_SECTION_ENROLLMENT, which runs every hour, 24-7.

If waiting until next working day for follow-up on Sunday DBERROR(s) from frequency scheduled chains, the DBERROR pages would not continue on Sunday because the APWXCHK_BACKLOG component of the APWXCHCK_HOURLY_SYSTEM_CHECK chain (which sent that page), is skipped based on the SUNDAY_ROLLF calendar. Therefore, once the virtual day for Sunday begins at 10 A.M., then DBERROR/LAUNCH ERROR pages will **not** be sent for the remainder of Sunday (or the rolled forward "Sunday" for holidays) until 10 A.M. the next working day.

Janice.
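
The CHAIN BEFORE condition described in the note above amounts to a flag-file check. Below is a minimal sketch of that logic in Python, for illustration only: the real condition is configured inside the scheduler, and the function name and return values here are hypothetical. Only the flag-file path comes from the note.

```python
import os

# Path taken from the note above; everything else is illustrative.
SHUTDOWN_FLAG = "/ais01/dat/misc/prod/BANPROD_shutdown_for_maint"


def chain_before_action(flag_path: str = SHUTDOWN_FLAG) -> str:
    """Return the action a BEFORE condition of this kind would take.

    If the maintenance flag file exists, the chain is cancelled before
    any prompt subvars (and their SQL against the database) are evaluated.
    """
    if os.path.exists(flag_path):
        return "CANCEL CHAIN"
    return "CONTINUE"


if __name__ == "__main__":
    print(chain_before_action())
```

The point of checking the file first is ordering: the check has to run before anything that needs a database connection.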

Aborted Module Name: AROSDGLI.AROSS162_01

Date: Day: Time: Resolution:

04/29/11 Mon 01:35 Restarted by ITS.

Error log and follow up comments:

ERROR at line 1:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "ODSMGR.ROKODST", line 362

ORA-00001: unique constraint (FAISMGR.RORNCHG_INDEX_01) violated

ORA-06512: at "ODSMGR.ROKODST", line 16

ORA-06512: at line 1

ORA-06512: at "ODSMGR.GOKODST", line 69

ORA-06512: at "TAISMGR.TT_TBRACCD_INSERT_ODS_CHANGE", line 24

ORA-04088: error during execution of trigger

ORA-06512: at "BANINST1.DML_TBRACCD", line 68

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 1685

ORA-06512: at line 1707

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

Since this error is based on a deadlock, nothing was committed and the process rolled back.

Therefore I would like to request that this process be restarted.

Josh.

Aborted Module Name: AROSDRFD_RC.AROSS002_01

Date: Day: Time: Resolution:

04/29/11 Mon 13:33 See follow up below.

Error log and follow up comments:

ERROR at line 1:

ORA-02291: integrity constraint (GENERAL.FK1_GLBEXTR_INV_GLBSLCT_KEY) violated

ORA-06512: at line 1706

ORA-06512: at line 1988

Steve located more information in the AROSDRFD_RC.AROSS200_01.utl_file1:

Current term: 201110

6 terms ago: 200910

Error when converting comm to student for pidm: 10704262 : ST :ORA-20100: ::Invalid Address code and sequence.::

******************************************************

*****Commercial accounts with default CD profile created *****

Can we proceed to the next job chain?

Jacque Clark.

I am looking into why it failed.

I would like to verify that this issue is isolated to test and will not appear in prod.

Can we wait a bit before proceeding?

Josh.

Yes. Please let me know if there is anything I need to do.

Jacque Clark.

It appears that the selection is not valid.

These definitions do not exist.

13:33:33 SQL> define AROS_SELECTION=AROSDRFD_RC_REFUND_ACH

13:33:33 SQL> define AROS_F_SELECTION=AROSDRFD_RC_EX_REFUND_ACH

It appears that the RC chain is running, so I believe the prompts should be fed as follows:

AROS_SELECTION: AROSDRFD_RC_REFUND_CHECK

AROS_F_SELECTION: AROSDRFD_RC_EX_REFUND_CHECK

Then the job can be restarted.

The selection should also be changed in the TSRRFND modules as well.

I am assuming those values are set since the chain has been brought in.

Josh.

Aborted Module Name: AREGORGN.AREGS518_01

Date: Day: Time: Resolution:

05/09/11 Mon 14:39 See follow up below.

Error log and follow up comments:

11141703 828337791 Roberts, Andrew 201090 63277 MATH 261 R RD RF

11141753 828338155 Kelly, Larissa 201010 11516 BZ 110 T RD RF

11142893 828346163 Ansah-Twum, Derek 201090 62137 CHEM 111 T RD RD

ERROR at line 1:

ORA-20000: ORU-10027: buffer overflow, limit of 100000 bytes

ORA-06512: at "SYS.DBMS_OUTPUT", line 32

ORA-06512: at "SYS.DBMS_OUTPUT", line 97

ORA-06512: at "SYS.DBMS_OUTPUT", line 112

ORA-06512: at line 327

14:39:36 32 ,g.shrtckg_gmod_code GRADE_MODE

14:39:36 97 AND r.sfrstcr_pidm = d.swrgpcd_pidm

14:39:36 112 AND s.ssbsect_crse_numb = TRIM(SUBSTR(d.swrgpcd_attr10

There should be a standard for enabling output.

I think it is:

Set serveroutput on size 1000000

In AREGS518, the size was set to 100000 instead of 1000000.

Josh.
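
One way to catch this kind of mismatch before a job runs is to scan the SQL*Plus source for undersized buffer settings. The sketch below is only an illustration under assumptions: the 1000000 standard is the value suggested in the note above, and the function name and the idea of scanning script text are hypothetical, not an existing site tool.

```python
import re
from typing import List

STANDARD_SIZE = 1000000  # value suggested in the note above (assumed standard)

# Matches SQL*Plus lines such as: set serveroutput on size 100000
SERVEROUTPUT_RE = re.compile(
    r"^\s*set\s+serveroutput\s+on\s+size\s+(\d+)",
    re.IGNORECASE | re.MULTILINE,
)


def undersized_serveroutput(script_text: str, minimum: int = STANDARD_SIZE) -> List[int]:
    """Return every declared DBMS_OUTPUT buffer size that is below the minimum."""
    sizes = [int(value) for value in SERVEROUTPUT_RE.findall(script_text)]
    return [size for size in sizes if size < minimum]
```

Running this over the text of a script such as AREGS518 would report the 100000 setting that led to the ORU-10027 overflow.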

Aborted Module Name: HRMSS230.SSH_SFTP_01

Date: Day: Time: Resolution:

05/14/11 Sat 04:22 See follow up below + attached email.

Error log and follow up comments:

# SRC FILE : /ais01/ftp/to/user/HRMSS230.VSP.DAT

# DST FILE : g0021702@ftp.vsp.com:/prod/g0021702

# IDENTITY : /home/jobprd/.ssh/csu_to_vsp-4096-20100924

# DIR HOST :

# DIR LOCAL :

# CHMOD :

# > Couldn't stat remote file: Permission denied

# > Can't ls: "/prod/g0021702" not found

# > sftp> put /ais01/ftp/to/user/HRMSS230.VSP.DAT /prod/g0021702

# > Uploading /ais01/ftp/to/user/HRMSS230.VSP.DAT to /prod/g0021702

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

# > Can't ls: "/prod/g0021702" not found

# > sftp> exit

# > (0)

# > (0)

#------------------------------------------------------------------------------

# RETURN CODE = 0

#==============================================================================

It appears that the file was uploaded to the vendor successfully, but could someone on the HR team contact the vendor to confirm this?

The job failure was due to the "Permission denied" messages when trying to "list" (-ls -l) the file on the vendor's site:

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

Previous executions of this chain indicate that we have not received the "Permission denied" message when attempting to "list" the file before and/or after the upload.

**Has something changed on the vendor's side which is preventing us from performing this command - and causing this error message?**

Once confirmation has been received that the file upload to the vendor was successful, this failed component may be deleted to allow the chain to complete. If the file was not transmitted successfully, then this failed component should be restarted - however, if the "permission denied" error message is produced when trying to list the file, the component will fail again.

Janice.

Do you have a contact at VSP that can verify they received this file?

Steve H.

I sent out two separate emails this morning and have not yet received a response. I'm very confident that they got the file and that the ABORT was due to a permissions change on their directory.

I'll let everyone know as soon as I get a response from VSP

-Bob-

Earlier this afternoon, Bob V. and I discussed this issue and decided that if VSP did not respond before Bob leaves for the day (at 3:30 P.M.), then we would assume that the vendor received the file and delete the aborted HRMSS230.SSH_SFTP_01 component - thereby allowing remainder of the chain to complete.

Please proceed with deleting HRMSS230.SSH_SFTP_01 from backlog as soon as possible.

Janice.

Aborted Module Name: HRMSR188.SEND_MAIL_OAE_01

Date: Day: Time: Resolution:

05/16/11 Mon 22:18 Restarted by ITS.

05/17/11 Tue 07:46 Restarted by ITS.

Error log and follow up comments:

05/16/11.

# FATAL : Error opening address file (SEND_MAIL_HRSAO_BENEFITS.LST) : A file or directory in the path name does not exist.

***

Please correct the recipients prompt in backlog for HRMSR188.SEND_MAIL_OAE_01 from SEND_MAIL_HRSAO_BENEFITS.LST To /ais01/dat/misc/mailst/SEND_MAIL.HRSAO_BENEFITS.LST

Also, followup by correcting the HRMSR188.OAE-FEEDBACK_01 chain component prompt #6 (Mailing List) value from:

SEND_MAIL_HRSAO_BENEFITS.LST

to:

{#mailst}/SEND_MAIL.HRSAO_BENEFITS.LST

I think we all missed the typo error of SEND_MAIL_ (when it should have been SEND_MAIL.), but we did previously discuss the need for the path, {#mailst}/, to be included within this prompt value.

Janice.
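
The two problems noted above (missing path, and SEND_MAIL_ instead of SEND_MAIL.) could be caught with a simple check of the prompt value. This is a hypothetical sketch: the function name is invented, and the rule that list files start with "SEND_MAIL." is an assumption inferred from the file names quoted in this entry.

```python
import os
from typing import List


def mailing_list_problems(prompt_value: str) -> List[str]:
    """Return a list of problems found in a mailing-list prompt value."""
    problems = []
    directory, name = os.path.split(prompt_value)
    if not directory:
        # e.g. the {#mailst}/ prefix or a full directory path is missing
        problems.append("missing directory path")
    if not name.startswith("SEND_MAIL."):
        # assumed naming convention for mailing-list files
        problems.append("file name does not start with 'SEND_MAIL.'")
    return problems
```

Applied to the original prompt value, SEND_MAIL_HRSAO_BENEFITS.LST, this reports both problems; the corrected value {#mailst}/SEND_MAIL.HRSAO_BENEFITS.LST reports none.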

05/17/11.

# Passing Parms : arg=[ from="appworx@mailer.is.colostate.edu" reply_to="" to="/ais01/dat/misc/mailst/SEND_MAIL.HRSAO_BENEFITS.LST" cc="" bcc="" subject="HRMSR188 Completed" --options=" ERROR -999 ORA-01722: invalid number" --options=""] /usr/bin/perl /appworx/csu/exec/SENDMAIL.PL from="appworx@mailer.is.colostate.edu" reply_to="" to="/ais01/dat/misc/mailst/SEND_MAIL.HRSAO_BENEFITS.LST" cc="" bcc="" subject="HRMSR188 Completed" --options=" ERROR -999 ORA-01722: invalid number" --options=""

#==============================================================================

# [ 2011.05.17-07:46:30 ]

#******************************************************************************

# FATAL : < main::parse

# FATAL : Unknown option ( ERROR -999 ORA-01722: invalid number)

Sometimes we see this problem with the SEND_MAIL and the multiselect parameter - there's usually no easy way to fix this, as we get an AWE-9999 Internal error if we try to click on the "Select" button for the Options prompt - i.e. it's a catch-22, it doesn't like the Ref=null value, but it won't let us edit the prompt.

Since this chain simply produces a report, I suggest that we just re-run the HRMSR188 chain - ***make sure to fix the OAE-FEEDBACK prompt in the chain first***, as described in earlier email. Delete this failed HRMSR188.SEND_MAIL_OAE_01 component to let the chain complete and then request the chain to run again.

Janice.

Aborted Module Name: KFSXAPEI.KFSX_JAVA_01

Date: Day: Time: Resolution:

05/17/11 Tue 06:03 Restarted by ITS.

05/17/11 Tue 14:16 Restarted by ITS.

Error log and follow up comments:

06:03.

2011-05-17 06:09:15,881 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800) ; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800) RunBatch ERROR: Exception found:

org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800) ; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800) Caused by: java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800)

at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

Can you please re-run this job? There is a single xml file that is bad and causing the abort. I have renamed this to 606788404omax1_30818MAY1611_1305630608197614015.xml_badfile so that I can take a look at the problem. However, if you re-run the job then all the other xml files should get processed.

I will make sure the corrected *.xml file gets placed into /vendorfiles/einvoice with the corresponding .done file.

John.

14:16.

* SQLException during execution of sql-statement:

* sql statement was 'INSERT INTO KRNS_NTE_T (NTE_ID,OBJ_ID,VER_NBR,RMT_OBJ_ID,AUTH_PRNCPL_ID,POST_TS,NTE_TYP_CD,TXT,PRG_CD,TPC_TXT) VALUES (?,?,?,?,?,?,?,?,?,?) '

* Exception message is [ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800)

* Vendor error code [12899]

* SQL state code [72000]

* Target class is 'org.kuali.rice.kns.bo.Note'

* PK of the target object is [noteIdentifier=727543]

* Source object: note(noteIdentifier)=(727543)

* The root stack trace is -->

* java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800)

at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

I found 4 potential files causing the error. I have renamed these so that they do not get processed.

Please re-run this job. If I have isolated the problem file, then it should finish.

4 potential bad files:

159148746_9538764508_1305626449968863369.xml

159148746_9538764516_1305626452665289181.xml

159148746_9538764524_1305626452300099449.xml

606788404omax1_25644MAY1611_1305630765656499603.xml

John.
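
When triaging this kind of failure, it can help to pull the column name and the two lengths out of the ORA-12899 text rather than reading them by eye. The following is a minimal sketch that parses the message format shown in the log above; the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ValueTooLarge(NamedTuple):
    column: str   # schema.table.column with the quotes removed
    actual: int   # length of the rejected value
    maximum: int  # declared column limit


ORA_12899_RE = re.compile(
    r'ORA-12899: value too large for column '
    r'(?P<column>"[^"]+"(?:\."[^"]+")*) '
    r'\(actual: (?P<actual>\d+), maximum: (?P<maximum>\d+)\)'
)


def parse_ora_12899(text: str) -> Optional[ValueTooLarge]:
    """Extract column, actual length and maximum from the first ORA-12899 in text."""
    match = ORA_12899_RE.search(text)
    if match is None:
        return None
    return ValueTooLarge(
        column=match.group("column").replace('"', ""),
        actual=int(match.group("actual")),
        maximum=int(match.group("maximum")),
    )
```

For the error in this entry, the parsed result shows a value 282 over the 800 limit of KFSUSER.KRNS_NTE_T.TXT, which gives a concrete size to look for when hunting for the offending xml file.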

Aborted Module Name: FAIDPACK_OD.LYNX_01

Date: Day: Time: Resolution:

05/26/11 Thu 04:15 See follow up from Janice below.

Error log and follow up comments:

STATUS=HTTP/1.1 200 OK

URL=http://wsprod.colostate.edu/cwis231/autorun/parent_tknt_email.cfm?ay=FAIDTKNT_EV (GET)

STATUS=HTTP/1.1 503 Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

+ orig_log_run=Y

+ export orig_log_run

+ log_run=Y

I forwarded this to Tom Biedscheid's group and he requested that we restart the failed component, which I've already done. According to Tom, a similar problem occurred a couple weeks ago and restarting was the solution that time :) FAIDTKNT_EV HAS COMPLETED.

Janice.
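
The "Status not OK" check in the log above can be pictured as a scan of the STATUS= lines. This is a rough sketch only: it assumes that any status other than 200 counts as a failure, which is an inference from the log and not the actual LYNX module logic, and the function name is hypothetical.

```python
import re
from typing import Iterable, List

# Matches lines such as: STATUS=HTTP/1.1 503 Server Error
STATUS_RE = re.compile(r"^STATUS=HTTP/\d\.\d\s+(\d{3})\b")


def failed_statuses(log_lines: Iterable[str]) -> List[str]:
    """Return the STATUS lines whose HTTP code is not 200 (assumed rule)."""
    failures = []
    for raw in log_lines:
        line = raw.strip()
        match = STATUS_RE.match(line)
        if match and match.group(1) != "200":
            failures.append(line)
    return failures
```

A 503 is usually a temporary server-side condition, which fits the observation that a simple restart resolved it both times.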

Aborted Module Name: OSYSJOBS_11.OSYSLLNK_01

Date: Day: Time: Resolution:

05/22/11 Tue 16:32 Restarted by ITS.

Error log and follow up comments:

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_09.OSYSPURG_01.6302011.6302014.00_opmn_logs is not valid.

***

*** ERROR: Guffey SCRIPT ABORTED - EXIT CODE=1

***

<<errtrap_rsh.7>> exit 1

+ grep SCRIPT ABORTED

+ /ais02/log/OSYSJOBS_11.OSYSLLNK_01.6302027.6302032.00.2011_05_22_1632.

I think this might be one of those failures where you just restart the job.

I think Janice gave information on this at one time.

I pasted the information in this email for you.

Rich.

Janice.

Aborted Module Name: HRMSWKSP_01.AROSS142_01

Date: Day: Time: Resolution:

05/23/11 Mon 22:16 Restarted by ITS.

Error log and follow up comments:

22:16:56 528 --Document the end of the program.

22:16:56 529 DBMS_OUTPUT.PUT_LINE

22:16:56 530 ('**** End of AROSS142 ' ||to_char(sysdate,'MM/DD/YYYY HH24:MI:SS'));

22:16:56 531 end;

22:16:56 532 /

old 5: vuser twraccd.twraccd_user%type := '&&p_user';

new 5: vuser twraccd.twraccd_user%type := 'FRANKMTZ';

old 6: vstart_date date := '&&p_start_dateDD_MON_YYYY';

new 6: vstart_date date := '06-may-2011';

old 7: vend_date date := '&&p_end_dateDD_MON_YYYY';

new 7: vend_date date := '20-may-2011';

old 9: WS_OBJECT_CODE varchar2(40) := '&&P_WS_OBJECT_CODE';

new 9: WS_OBJECT_CODE varchar2(40) := '4401';

**** Start of AROSS142 05/23/2011 22:16:57

Batch Number: WSFRANKMTZ2011050019

Account Does Not Exist: 827855071 6464040 4401

declare

ERROR at line 1:

ORA-20100: Account Does Not Exist in TBRACCT, Account B

ORA-06512: at line 266

ORA-06512: at line 499

This process pulls data from HR and creates a TWARBUS batch to create invoices for the employers (Work Study). This person is the problem: 827855071 6464040 4401. The account object code does not exist in TBRACCT, specifically tbracct_account b. This means a detail code needs to be set up if the users want to apply transactions to this account through TWARBUS.

Generally to fix this we would contact Frank Martinez. Give him the record and ask if we can skip it to continue the schedule.

If he agrees, we should put a temp version of AROSS142 out on kebler and comment out the raise_application_error component of the error message below so that we log the problem record but don't abort the program.

Then we can move on with the schedule.

Josh.

I put a temporary version of AROSS142 in /ais01/src/sql/temp on Kebler as Josh suggested. Please continue the aborted job.

Rob.

Aborted Module Name: HRMSS230.SSH_SFTP_01

Date: Day: Time: Resolution:

05/31/11 Tue 00:30 Deleted by ITS.

Error log and follow up comments:

# > Couldn't stat remote file: Permission denied

# > Couldn't stat remote file: Permission denied

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

It appears that the file was uploaded to the vendor successfully, but could someone on the HR team contact the vendor to confirm this?

The job failure was due to the "Permission denied" messages when trying to "list" (-ls -l) the file on the vendor's site:

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

Previous executions of this chain indicate that we have not received the "Permission denied" message when attempting to "list" the file before and/or after the upload.

**Has something changed on the vendor's side which is preventing us from performing this command - and causing this error message?**

Janice.

Bob,

Do you have a contact at VSP that can verify they received this file?

Steve.

I will contact them now and I'll keep you all posted.

I sent out two separate emails this morning and have not yet received a response. I'm very confident that they got the file and that the ABORT was due to a permissions change on their directory.

I'll let everyone know as soon as I get a response from VSP.

-Bob-

Please proceed with deleting HRMSS230.SSH_SFTP_01 from backlog as soon as possible.

The issue that occurred 2 weeks ago when HRMSS230 failed happened again last Friday night. The SSH_SFTP step will continue to fail with the permissions problems until this issue is resolved with the vendor. I suggest that the contact be made once again with the vendor to 1) confirm if they received the file from Friday night, and 2) to resolve what has changed on their end which causes the permissions problems when the SSH_SFTP process attempts to issue an "ls" on the file (on their server):

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

Janice.

1) The vendor, VSP, just informed me that they did receive the file on Friday.

2) I've already sent an email to Elden to ask him to contact the EDI team at VSP to get this resolved.

-Bob-

Aborted Module Name: AREGDPTR_UG.AREGS607_01

Date: Day: Time: Resolution:

05/28/11 Tue 00:08 See follow up below.

Error log and follow up comments:

ORA-20100: ::An exact data code already exists for this person.::

ORA-06512: at "BANINST1.CSUG_API_GP_DTCD_RULES", line 2622

ORA-06512: at "BANINST1.CSUG_API_GP_DTCD", line 1256

ORA-06512: at line 418

AREGDPTR_UG.AREGS607_01 is complete.

David.

Looks like the AREGDPTR_UG & GR finished. YEA!!!!

Wondered if I could go to mainsite and pick up the labels this afternoon instead of waiting until tomorrow’s delivery? Thank you for getting this resolved. :)

Denise.

The diploma transcript process failed Friday night. The problem seems to be a duplicate 8021/DIPLTRNPRT data code for student 824511472 (pidm is 10630196). Could you please delete one of these rows and then we can restart the job?

Rob.

Aborted Module Name: HRMSKFS_QPH.HRMSS175_01

Date: Day: Time: Resolution:

06/01/11 Wed 13:59 See follow up below.

Error log and follow up comments:

# open : Open Host (ftp.dbman.com)

#******************************************************************************

# FATAL : Cannot connect to (ftp.dbman.com): Net::FTP: connect: A system call received a parameter that is not valid.

#------------------------------------------------------------------------------

# USAGE: /appworx/csu/exec/FTP_ENHANCED.PL \

# remote_host=override_host_name\

# transfer_mode=transfer_command\

# translate=translation_mode\

# src_file=fully_qualified_source_file\

# dst_file=fully_qualified_destination_file\

# site_options=comma_delimited_site_options\

# local_options=semicolon_delimited_local_options\

# transfer_mode values

# append, dir, get, put, recv, send, submit

# translate values:

# ascii, binary, ebcdic

# site_options values

# comma delimited site options

# RECFM=FBA,LRECL=133,BLKSIZE=3325

# local_options values

# semicolon delimited local options

# active | passive | cd=remote_dir

# Also, these environment variable must be set

# net_connect, db_login, db_password

#------------------------------------------------------------------------------

I think you've got the aborted module name wrong in this email - shouldn’t it be HRMSDIRC_VT.FTP_AIS01_DIR_01? Elden isn't here today, but maybe the HR Team can check with Juliana Hissrich to see if anything has changed with the vendor and/or if their server is available?

If the HR Team hasn't already contacted Juliana, then I think we need to, at a minimum, send an email to hrsao_printed_directory@mail.colostate.edu to let them know that the ftp of the printed directory test file to vendor failed.

If HR Team has already contacted hrsao_printed_directory@mail.colostate.edu about this problem, then we'll just wait to hear back from Juliana and/or vendor.

Janice.

Does the log give a reason why the ftp failed?

Steve.

The earlier email traffic includes pretty much all we see in the joblog - don't think we're even getting connected. If you go back to the original email that Robin sent, it would have the Appman joblog attached.

Janice.

Aborted Module Name: HRMSS230.SSH_SFTP_01

Date: Day: Time: Resolution:

06/01/11 Wed 22:37 Deleted by ITS.

Error log and follow up comments:

# > Remote working directory: /g0021702

# > sftp> lpwd

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l /ais01/ftp/to/user/HRMSS230.VSP.DAT

# > -rw-rw---- 1 appworx Gftp 682340 Jun 01 22:31 /ais01/ftp/to/user/HRMSS230.VSP.DAT

# > sftp> -ls -l /prod/g0021702

This is the same problem that we've been having for the previous two times that this job ran - need to contact vendor to find out why permissions have changed on their end so that our process can send the file, but cannot perform an "ls" to verify existence of file.

Janice.

Please delete. Yes, this is the same old problem (permissions) that will hopefully get resolved soon.

-Bob-

I've altered the logic in our global COMPLETION script, /appworx/exec/COMPLETION, to allow for an exceptions file when scanning *FTP* output listings. We generally consider "Permission denied" a fatal error, but I've placed the following into the newly created "allowed exceptions" file, /ais01/dat/misc/prod/errstrg_ftp_exceptions:

Couldn't stat remote file: Permission denied

If anyone thinks the above "exception" message will permit an unacceptable error to slip through undetected, then we'll need to discuss. Otherwise, this should solve the problem we've been having with HRMSS230.SSH_SFTP_01, which has been successfully transmitting the file, but failing when the COMPLETION script detected "Permission denied" within the output. HRMSS230_VSP is next scheduled to run this Friday (6/10).

Just to test out the new logic in the COMPLETION script, I restarted the aborted HRMSDIRC_VT.FTP_AIS01_DIR_01 component, which of course failed. I wasn't expecting this restart to ftp successfully - it was just a convenient way to be sure there were no syntax errors in the *FTP* specific logic within the COMPLETION script.

Janice.
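
The exceptions-file idea described above can be sketched as follows. This is not the COMPLETION script itself (its source is not shown here); it is a minimal Python illustration of the technique, with a hypothetical function name and plain substring matching assumed for both the fatal strings and the allowed exceptions.

```python
from typing import Iterable, List


def fatal_lines(
    output: Iterable[str],
    fatal_strings: Iterable[str],
    allowed_exceptions: Iterable[str],
) -> List[str]:
    """Return output lines that contain a fatal string and are not exempted."""
    fatal = list(fatal_strings)
    allowed = list(allowed_exceptions)
    hits = []
    for line in output:
        if any(exception in line for exception in allowed):
            continue  # line matches an allowed exception, so it is ignored
        if any(error in line for error in fatal):
            hits.append(line)
    return hits
```

With "Permission denied" as a fatal string and "Couldn't stat remote file: Permission denied" as an allowed exception, the sftp listing failure no longer aborts the job, while any other "Permission denied" line is still reported. One design caveat of this simple form: a line that contains an allowed exception is skipped even if it also contains a different fatal string, so exception entries are best kept as specific as possible.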

Aborted Module Name: FAIDDLDR_EV.GLBDATA_04

Date: Day: Time: Resolution:

06/04/11 Sat 23:11 Restarted by David.

Error log and follow up comments:

SUNGARD HIGHER EDUCATION

POPULATION SELECTION EXTRACT

CONTROL REPORT PAGE 1

Start Time: 04-JUN-2011 00:23:13

GLBDATA Version: 8.3.0.5

Selection ID 1: FAIDDLDR_EV_EXIT_SAT

Application: FINAID

Creator ID: FAUSER

*ERROR* Dynamic parm FAID_EV_AID_YEAR not found or is null

SQLCODE = 1403

SQL ERROR = ORA-01403: no data found

X01 ROLLBACK SQLCODE=0000

X01 COMMIT (1) SQLCODE=0000

SQLCODE = 0000

ORA-01403: no data found

DQY-ABORT ROLLBACK SQLCODE = 0000

FAIDDLDR_EV is complete

David.

The down side to this is that it was restarted so late on Sunday that the FAID schedule was still running when the Sunday night Oracle recycle of Banner occurred.

This happened part way through FAIDDISB_FA RPEDISB program – causing it to fail.

Tom, Is there any problem just doing a restart on this failed component – do we need to worry about “fixing” any data before we restart due to the failure part way through processing?

Janice.

No problem restarting when it's a GLBDATA run.

Tom.

Uh.. that confused me – it failed in the RPEDISB program, not GLBDATA?

Janice.

Got it. I was lost in the subject line. There's no problem restarting RPEDISB, either.

Tom.

Oh.. sorry about that, it was “connected” to the earlier fix of FAIDDLDR_EV.GLBDATA_04 because that was restarted so late on Sunday that FAIDDISB ran into the Sunday night database recycle. I’ve restarted FAIDDISB_FA.RPEDISB_01 – hopefully we can finish the FAID schedule soon!

Janice.

Aborted Module Name: FAIDSNTD.WAIT_FOR_CHAINS_01

Date: Day: Time: Resolution:

08/23/11 Tue 08:20 See follow up from Janice below.

10/26/13 Sat 06:10 Restarted by Joleen.

Error log and follow up comments:

1> /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.chain_prefix.dat

+ [[ -s

+ /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.chain_prefix.dat

+ ]] print *** ERROR: NO CHAIN MODULES FOUND FOR CHAIN

*** ERROR: NO CHAIN MODULES FOUND FOR CHAIN

+ cat /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.dat

6800769.01 FAID TDCLIENT_SEND 08/23 08:22 ABORTED AWPROD APPWORX

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

Notice that this indicates that a TDCLIENT_SEND spawned module failed. If you look at that failure, you'll see that it redirects output to the /ais01/dat/work/prod/TDCLIENT_SEND.CRPG11IN.TXT file, which contains the following error message:

WARNING: Failed to connect to server

Error connecting to network SAIGPORTAL

(234) FTP connection attempt failed.

SSL Handshake failed:

I suggest restarting the failed TDCLIENT_SEND component to see if it can connect to SAIG, then restart the failed FAIDSNTD.WAIT_FOR_CHAINS_01. The FAIDSNTD is waiting for all the spawned TDCLIENT_SEND modules from its predecessor chains to have completed. The output reports from all these spawned TDCLIENT_SEND modules are then captured to the FAIDSNTD VistaPlus report.

Let me know if you have questions.

Janice.
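
To see at a glance which spawned modules are holding up WAIT_FOR_CHAINS, the jq.dat listing shown in the log can be filtered for ABORTED entries. The sketch below is illustrative only: the column meanings (job id first, module name third) are assumed from the sample lines in this entry, and the function name is hypothetical.

```python
from typing import Iterable, List, Tuple


def aborted_modules(jq_lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (job id, module name) for every line whose status is ABORTED.

    Assumes whitespace-separated columns as in the sample listing:
    job id, application, module, date, time, status, queue, user.
    """
    aborted = []
    for line in jq_lines:
        fields = line.split()
        if len(fields) >= 3 and "ABORTED" in fields:
            aborted.append((fields[0], fields[2]))
    return aborted
```

Each job id returned is a candidate for restart before FAIDSNTD.WAIT_FOR_CHAINS_01 itself is restarted.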

10/26/2013 10:24 JWEARNE

FAIDSNTD.WAIT_FOR_CHAINS_01 11752556 10-26-2013 06:10:43

MDT 202 ABORTED

Only FAIDAM99 was left to run so I restarted FAIDSNTD and it has finished running.

*** ERROR: NO CHAIN MODULES FOUND FOR CHAIN

+ cat /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.dat

11753072.00 FAID FAIDPLOD_PELL_ORIG_D10/26 06:09 FINISHED AWPROD

+ exit 1

+ err=1

The run flags for FAIDPLOD are N for both schedules.

Here is the wait for chains file:

BROWSE --- /ais01/dat/apmx/prod/TDCLIENT_SEND_SPAWNED_FROM_PROCESS_FLOWS

********************************* TOP OF DATA

FAIDCORR

FAIDDLPL

FAIDDLST

FAIDPLOD

FAIDTMEX

******************************** BOTTOM OF DATA

All FAID jobs were done running. I restarted FAIDSNTD and it completed.

Joleen.

Aborted Module Name: HRMSDEM_SAL.HRMSR060_01

Date: Day: Time: Resolution:

08/25/11 Thu 12:55 See follow up below.

Error log and follow up comments:

Enter Password:

REP-0069: Internal error

REP-57054: In-process job terminated:Finished successfully but output is voided

Report Builder: Release 10.1.2.2.0 - Production on Wed Aug 24 16:57:27 2011

Delete it. If we need to we can run it later through the application since there are no Vista plus implications.

Steve H.

I deleted the HRMSDEM_SAL.NOTIFY_FOR_APWX_01 component from backlog, deleted the #AW99_6808259 subvar, then deleted the aborted HRMSDEM_SAL.HRMSR060_01 component, thereby allowing the CHAIN_FINISH component to run. HRMSDEM_DEMAND_DEPOSIT_ADVICES has completed.

By the way, the reason I deleted the #AW99_6808259 chain-specific subvar (which had the value of HRMSSAL4) was so that the HRMSDEM_SAL.CHAIN_FINISH_01 component would not write an HRMSSAL4_CHAIN_FINISH_HRMSDEM_SAL... entry in /ais01/dat/apwx/prod. We've already allowed HRMSSAL4.CHAIN_SUMMARY to proceed and pick up all the related HRMSSAL4_CHAIN_FINISH_* entries for the Salary Phase 4 feedback email -- don't want to create one now for this chain, which could potentially be picked up in the feedback summary the next time Salary Phase 4 runs.

Janice

Aborted Module Name: DOITHRS1.FTPS_CURL_01

Date: Day: Time: Resolution:

07/07/11 Thu 17:01 Restarted by Janice.

Error log and follow up comments:

DOITHRS1.FTPS_CURL_01 ABORTED last night with the error shown below. There must have been a connectivity problem at GGCC at that time - I restarted this component and it finished successfully this morning.

# > * SSL read: error:00000000:lib(0):func(0):reason(0), errno 73

# > * FTP response reading failed

# > * Closing connection #0

# >

# > curl: (56) SSL read: error:00000000:lib(0):func(0):reason(0), errno 73

# > (56)

#==============================================================================

# FATAL : Command failed with code : 56

Janice.

Aborted Module Name: HRMSCPR_HRL.HRMS_SPAWN_LOG_01

Date: Day: Time: Resolution:

07/08/11 Fri 16:14 See follow up below.

Error log and follow up comments:

+ 1>> /ais01/dat/work/prod/HRMSCPR_HRL.HRMS_SPAWN_LOG_01.Spawned_Log

+ read this_spawned_req

+ grep C

+ cut -f2 -d ?

+ print 6789789?E

+ print *** \n*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION \n***

***

*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION

***

+ exit 100

This is similar to the situation which occurred in the June Salary Phase 2. HRMSCPR_HRL.HRMS_SPAWN_LOG_01 aborted because it found a concurrent request, spawned by HRMSS064, which is in Error status - i.e. a spawned request which did **not** successfully complete.

In this case, HRL_15-JUL-2011 (Hourly Employee Pre-gen Distribution Lines) request id 6789787 spawned PSP: Import Pre-Generated Distribution Lines (REQUEST ID 6789789), and the Status of the spawned 6789789 is "Error" - see error from 6789789 below.

The program failed with the following error(s) :

This batch has errors. To see the error messages please use Pre-gen distribution lines form.

Janice.
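
The trace above (print 6789789?E, cut on "?", grep C) suggests that each spawned request is recorded as an id and a status code separated by "?", and that a status without "C" is treated as unsuccessful. The sketch below illustrates that check under exactly those assumptions; the function name is hypothetical and the meaning of the status letters is inferred from the trace, not confirmed.

```python
from typing import Iterable, List, Tuple


def unsuccessful_requests(entries: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (request id, status) for entries whose status does not contain 'C'.

    Each entry is assumed to look like '6789789?E': request id, '?', status code.
    """
    failed = []
    for entry in entries:
        request_id, _, status = entry.strip().partition("?")
        if "C" not in status:
            failed.append((request_id, status))
    return failed
```

Any request id returned here is the one to look up within HR for the underlying error, as was done for request 6789789.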

I fixed the first fail and pre-gen has finished but it did not go to the balance report?

Vickie.

The first failure caused the HRMSS064 (Hourly Employee Pre-gen Distribution Lines) program to fail. After you fixed that, HRMSS064 was restarted and did finish successfully. However, there's more to the equation than HRMSS064 completing successfully, as this program submits other concurrent requests to perform some of the "work"... and it was one of these spawned concurrent requests that failed as noted in the earlier email. Viewing request id 6789789 within HR may provide more information to you regarding the reason for the failure and how to proceed. The Phase 2 process will not move forward with the next step (Payroll Balance Report) until the aforementioned problem has been resolved.

Janice.

Aborted Module Name: HRMSSQWL.SQWL-LOOP_01

Date: Day: Time: Resolution:

07/12/11 Tue 09:02 See follow up below.

Error log and follow up comments:

Value exceeded allowable range (line 273 of COSQWL_SUPPLEMENTAL)

Cause: Caused by Oracle error 6502 occurring during the execution of the

formula which is raised when an arithmetic conversion error, or string tr

+ awrun SEND_MAIL -v HRMSSQWL_SEND_MAIL_CO -arg

+ jobprd@mailer.is.colostate.edu

+ /ais01/dat/misc/mailst/SEND_MAIL.HRMS.APMX.ALERT.LST

+ /ais01/dat/misc/mailst/SEND_MAIL.HRSAO_SQWLARCH_FOLLOWUP.LST

+ /ais01/dat/misc/mailst/SEND_MAIL.HRMS.APMX.ALERT.LST _NULL_

+ PYUSSQWLGRE(ID=6792213) SPAWNED SQWLARCH PROBLEM - Colorado _NULL_

+ /ais01/dat/misc/mailst/SEND_MAIL_TEMPLATE.SQWL_FAILURE.TXT

+ SQWLARCH_(SPAWNED_FROM_PYUSSQLWLGRE(ID=6792213) Colorado 6792214

+ /oraapps/hrprod/log/l6792214.req

JOBID: 6577943

+ exit 1

+ err=1

The sqwl stuff directly sends a follow-up email to the users (HRSAO SQWLArch Followup), as well as to the Alert HRMS WHRS and Alert APMX lists. Please refer to the earlier mail, with subject PYUSSQWLGRE(ID=6792213) SPAWNED SQWLARCH PROBLEM - Colorado, dated Tue 7/12/2011 9:58 AM. Because we have this automated feedback reporting for the SQWL failures, there is no need to also send the normal HRMS ABORT followup email.

Janice.

Aborted Module Name: AREGDYTR.CONVERT_PDFTOPS_01

Date: Day: Time: Resolution:

07/13/11 Wed 07:00 See follow up below.

Error log and follow up comments:

There is no output file. The BEFORE condition checks for the file below and, if it does not exist, aborts the task each time the condition is true.

{#spool_out}/AREGDYTR.AREGR600.{chain_id}.PDF

After talking to Vicki, Steve and I followed the instructions Janice outlined below from the abort book (see page 571 or note from Janice below).

We deleted the failed component and the subsequent Spool Filter, which allowed the rest of the chain to run. Unfortunately, that meant that the transcripts that were expected were not generated. In order to print those, we changed the subvar #AREGORTR_REPRINT_DATE to today’s date and ran AREGORTR_ONREQ_TRANSCRIPT. There is a note in the chain about it:

COMMENTS:

NOTE: This chain must have a correct date updated - #AREGORTR_REPRINT_DATE

*** Oracle Report: AREGR600

Processing Failed -- Report was not successfully generated.

*** From Appworx Chain: AREGDYTR_DAILY_TRANSCRIPT

*** Oracle Instance: banprod

*** Report Parameters Used:

req_levl=AL

p_source=B

PDFEMBED=YES

*** Report Errors:

REP-0177: Error while running in remote server

Unable to retrieve a string from the Report Builder message file.

REP--002:

Janice.

Aborted Module Name: AROSDBIO.AROSS141_01

Date: Day: Time: Resolution:

07/21/11 Thu 18:27 Restarted by Joleen.

04/03/12 Tue 18:06 Restarted by Joleen.

Error log and follow up comments:

07/21/11.

ERROR at line 4:

ORA-01847: day of month must be between 1 and last day of month

ORA-06512: at line 341

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

To fix this abort, Rob figured out which record was causing the problem. It turns out that the birthdate requires dd/mm/yyyy, and the record in question had a 5 instead of 05. The records in this module are created by students filling out a form online. AR doesn't have the option to modify the records, so they had to delete the problem record and manually re-enter it for the student. I restarted the module and it finished.

Joleen.
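A minimal sketch of a pre-check that would flag a record like this before the load runs. It assumes the dd/mm/yyyy layout mentioned in the note above; the function name is hypothetical and is not part of AROSS141.

```python
import re
from datetime import datetime

# Assumed layout (from the note above): two-digit day, two-digit month, four-digit year.
STRICT_DATE = re.compile(r"\d{2}/\d{2}/\d{4}")


def is_valid_birthdate(text):
    """Return True only for a zero-padded dd/mm/yyyy value that is a real calendar date."""
    if not STRICT_DATE.fullmatch(text):
        return False  # e.g. "5/11/1990": the day is not zero-padded
    try:
        datetime.strptime(text, "%d/%m/%Y")
    except ValueError:
        return False  # e.g. "31/02/1990": no such day in that month
    return True
```

Records that fail the check could be listed for AR to correct before the module starts, instead of aborting it partway through.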

04/03/12.

ERROR at line 1:

ORA-20100: AROSS141 Failure: -20100 ORA-20100: ::E-mail addresses must have a

least 1 character in front of "@" and at least 1 character

ORA-06512: at line 104

Here’s what /orautl/BANPROD/AROSDBIO.AROSS141_01.utl_file1 says:

Insert failed: 829854495 Phipps -20100 ORA-20100: ::E-mail addresses must have a least 1 character in front of "@" and at least 1 character

Steve G.

Thanks for finding the problem record. Janet was able to edit the record and correct the email address. Can you start our schedule up again?

Steven.
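A minimal sketch of the rule quoted in the error message, for screening addresses before the insert. The message is cut off in the log, so the "at least 1 character after the @" half is an assumption, and the function name is hypothetical.

```python
def has_valid_at_sign(address):
    """Return True if there is at least one character before the first "@"
    and (assumed, the log message is truncated) at least one after it."""
    local, separator, domain = address.partition("@")
    return separator == "@" and len(local) >= 1 and len(domain) >= 1
```

This only mirrors the one rule named in the message. It is not a full email address validator.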

Aborted Module Name: HRMSKFS_SAL.HRMSS175_01

Date: Day: Time: Resolution:

07/22/11 Fri 15:13 See follow up below.

04/29/14 Tue 14:24 Restarted by Robin.

Error log and follow up comments:

07/22/11.

15:13:54 1630 If (ctl_sum_salary - l_net - ctl_sum_ee_deductions - ctl_sum_cash_deductions) <> 0 Then

15:13:54 1631 DBMS_OUTPUT.PUT_LINE('###################################################');

15:13:54 1632 DBMS_OUTPUT.PUT_LINE('##### #####');

15:13:54 1633 DBMS_OUTPUT.PUT_LINE('#### ####');

15:13:54 1634 DBMS_OUTPUT.PUT_LINE('### KFS FILE IS OUT OF BALANCE ###');

15:13:54 1635 DBMS_OUTPUT.PUT_LINE('#### ####');

15:13:54 1636 DBMS_OUTPUT.PUT_LINE('##### #####');

15:13:54 1637 DBMS_OUTPUT.PUT_LINE('###################################################');

15:13:54 1638 -- RAISE kfs_not_balanced;

15:13:54 1639 End if;

The out of balance condition is going to take some time to research. The plan is to complete salary phase 4 processing on Monday. I am not sure what implications this has for encumbrances. Does this mean we need to suspend encumbrances until after phase 4 has completed?

Steve.
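For reference, the balance test at line 1630 of the log can be sketched outside PL/SQL as below. The variable names follow the log excerpt; the function name and the use of Decimal are assumptions made for illustration.

```python
from decimal import Decimal


def kfs_balance_difference(ctl_sum_salary, l_net, ctl_sum_ee_deductions, ctl_sum_cash_deductions):
    """Return salary minus net pay and both deduction totals.

    A result other than zero corresponds to the
    "KFS FILE IS OUT OF BALANCE" banner in the log above.
    """
    return (
        Decimal(ctl_sum_salary)
        - Decimal(l_net)
        - Decimal(ctl_sum_ee_deductions)
        - Decimal(ctl_sum_cash_deductions)
    )
```

Passing the totals as strings (for example "1000.00") avoids binary floating-point rounding, which matters when the comparison is against exactly zero.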

I just cleared up the stall in tonight's HRMS schedule - excerpt from News file follows:

I saw Steve Hill's email earlier this evening that the Salary Phase 4 would have to wait until Monday.

In light of that, we don't want to run encumbrances tonight anyway.

I'm not sure what happened to the safeguard that we are supposed to have in place to make sure that HRMSAW15 runs before staff leave for the day. We had this situation (last month, I think)... HRMSAW15 was waiting for a notify file from Salary Phase4, which of course isn't going to be created tonight! HRMSENCD was waiting for HRMSAW15 - and we would have had a stalled HRMS schedule due to this failure to make sure that HRMSAW15 "stall" was handled during working hours.

I deleted the waiting HRMSAW15.WAIT_FOR_APWX_SAL4 component to clear out the stall! The HRMSENCD chain has been automatically cancelled (due to Salary running, but Phase 4 not done), so I think we are on track now for the HRMS schedule to proceed tonight.

Janice.

04/29/14.

*** Follow-up Required -- Review output included below:

<<include=>>

The HR test process flow run in AppMan AWTEST did not have a *** AWTEST *** prefix in the email subject. The mail list needs to be checked too. We will add this to our SQL updates list.

How to use SQL update statement to avoid above :)

Gudrun.

Aborted Module Name: KFSXFPDV.KFSX_JAVA_01

Date: Day: Time: Resolution:

08/05/11 Fri 00:02 Restarted by Joleen.

Error log and follow up comments:

ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-43-732.pdf

/ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-51-63.pdf

+ + ls /ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-43-732.pdf /ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-51-63.pdf

this_pdf=/ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-43-732.pdf

/ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-51-63.pdf

+ cp /ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-43-732.pdf /ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-51-63.pdf /ais01/spool/vplus/out/KFSXFPDV.KFSX_JAVA_01.6660543.pdf

cp: 0653-437 /ais01/spool/vplus/out/KFSXFPDV.KFSX_JAVA_01.6660543.pdf is not a directory.

+ exit 1

I discussed this earlier today with Josh, but just to bring everyone else up to speed, here’s what happened:

The KFSX_JAVA global script expects only one output pdf file to be created from a java program. Actually, we have very few java programs which still create pdf output files – most were converted (by the foundation) from .pdf output to .txt output. The output files are date-time stamped, so we have logic with the global script to “find” the pdf file – thinking that logic will return just **ONE** pdf filename. The subsequent logic doesn’t work well when more than **ONE** pdf filename was returned – i.e. it caused this error when it tried to execute the cp (copy command) and provided too many parameters for this command.

To avoid this problem in the future, Facilities will need to send only one .xml file to be processed via the nightly KFSXFPDV_FP_DV_BATCHBFS_UPLOAD chain. If a backlog of .xml file(s) needs to be processed, we could possibly run this chain multiple times during the daytime to catch up. Please share with Facilities staff this restriction – otherwise, this error will continue to occur and **NOTHING** will get captured to VistaPlus from the loadDisbursementVouchersStep java program. The KFSXFPDV chain does have the CHAIN_CANCEL feature, so it will not cause a stall within the KFSX schedule.

I have manually copied the two pdf output reports from last night’s KFSXFPDV to the /ais01/spool/vplus/out directory (with unique names) and also reconstructed the “chain vplus” output files within the /ais01/spool/vplus/temp/KFSXFPDV_tempdir/ directory. I also manually created entries in the /ais01/spool/vplus/out/KFSXFPDV_VPLUS_DRIVER for capturing the two pdf reports. This manual activity was intermingled with running a subset of the KFSXFPDV chain --- with manual creation of files occurring after the CHAIN_INIT component, but before the CHAIN_VPLUS component. All reports have now been successfully captured to VistaPlus. BTW- this is a somewhat painful process to manually create these files, so please steer Facilities into the only **ONE** feeder file per night routine!

Janice.
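A minimal sketch of how the copy step could tolerate more than one PDF, giving each file its own target name instead of passing every source to a single cp. This is not the KFSX_JAVA global script. The function names and the ".01", ".02" suffix scheme are assumptions, and the VistaPlus driver entries that Janice created by hand are not covered.

```python
import shutil
from pathlib import Path


def plan_pdf_targets(pdf_paths, out_dir, base_name):
    """Pair every source PDF with its own target file name.

    One PDF keeps the plain name; several PDFs get .01, .02, ... suffixes
    (a hypothetical naming scheme) so that no copy needs a directory target.
    """
    sources = sorted(str(path) for path in pdf_paths)
    plan = []
    for index, source in enumerate(sources, start=1):
        suffix = "" if len(sources) == 1 else f".{index:02d}"
        target = Path(out_dir) / f"{base_name}{suffix}.pdf"
        plan.append((source, str(target)))
    return plan


def copy_pdfs(pdf_paths, out_dir, base_name):
    """Copy each PDF to its planned target and return the target names."""
    plan = plan_pdf_targets(pdf_paths, out_dir, base_name)
    for source, target in plan:
        shutil.copyfile(source, target)
    return [target for _, target in plan]
```

With the two files from the log, the plan would hold two separate source and target pairs, so neither copy would hit the "is not a directory" error.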

This job runs nightly around 10 pm, so we won’t skip any files. The abort was related to putting the job output into our VistaPlus system. I don’t believe that anyone is checking VistaPlus for the summary report of the DVs uploaded (please correct me if I am wrong). So worst case scenario, the DVs load, but VistaPlus does not have the updated job log/summary report. We’ll keep an eye on this to determine if we need an alternate solution.

John.

Aborted Module Name: KFSXAPEI.KFSX_JAVA_01

Date: Day: Time: Resolution:

08/11/11 Thu 06:03 See follow up below.

Error log and follow up comments:

at org.kuali.rice.kns.service.impl.DocumentServiceImpl.validateAndPersistDocument(DocumentServiceImpl.java:679)

at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:75)

at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

2011-08-11 06:10:11,721 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXAPEI.electronicInvoiceExtractStep.6743617.6743620.00 steps: [electronicInvoiceExtractStep]

2011-08-11 06:10:11,721 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.kuali.rice.kns.exception.ValidationException: business rule evaluation failed

RunBatch ERROR: Exception found:

Check KFSXAPEI log as Appman log above does not pinpoint where the error is

cd /ais02/log

ls -al KFSXAPEI*

Select the aborted log (should be the latest one out there).

2011-08-11 11:08:42,892 [main] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl ::

Saving Invoice Reject for DUNS '606788404'

2011-08-11 11:08:42,892 [main] INFO org.kuali.rice.kns.document.DocumentBase :: invoking rules engine on document

1453289

2011-08-11 11:08:42,908 [main] INFO org.kuali.rice.kns.document.DocumentBase :: [document.invoiceRejectItems[0].i

nvoiceItemCatalogNumber] error.format.org.kuali.rice.kns.datadictionary.validation.charlevel.AnyCharacterValidatio

nPattern(Invoice Catalog Number (Catalog Number))

2011-08-11 11:08:51,968 [main] ERROR org.kuali.kfs.sys.batch.Job :: Exception occured executing step

org.kuali.rice.kns.exception.ValidationException: business rule evaluation failed

at org.kuali.rice.kns.document.DocumentBase.validateBusinessRules(DocumentBase.java:581)

at org.kuali.rice.kns.service.impl.DocumentServiceImpl.validateAndPersistDocument(DocumentServiceImpl.java

:679)

This shows the failed document, 1453289. KFS technical staff went to that document, discovered a space character, and corrected the file.

We restarted the ABORTED KFSXAPEI.KFSX_JAVA_01 step and the chain finished successfully.

Dermot.

I don't know what business rule failed, but I think I know the eInvoice file that caused the job to stop:

/ais02/app/kfs/prd/work/staging/purap/electronicInvoice/606788404omax1_32681AUG1011_1313042073898565326.xml

is the offending file that caused the job to abort. If we move that file out of there, we can then rerun the job for the rest of the files while I try to figure out what is wrong with this single eInvoice file.

Matt.

Aborted Module Name: AGENAM99.WAIT_FOR_CHAINS_01

Date: Day: Time: Resolution:

08/13/11 Sat 03:40 See follow up below.

Error log and follow up comments:

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

Could not open server pipe.

I found past references of this error in our "Abort log" document that indicated we restarted the component. I did this, and now AGENAM99.WAIT_FOR_CHAINS_01 is running and waiting for AGENDYGN_DAILY_GENERAL, which is being held up by the abort within HRMSSERP_UPDT_ELEMENT_MEDICARE.

Steve.

Aborted Module Name: AROSFRQ1.AROS-PYMTS-LOOP_01

Date: Day: Time: Resolution:

08/13/11 Sat 10:15 See follow up below.

Error log and follow up comments:

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

Could not open server pipe.

Steve.

Aborted Module Name: AREGDYCR.AREGS304_01

Date: Day: Time: Resolution:

08/16/11 Tue 05:11 See follow up below.

Error log and follow up comments:

05:11:24 SQL> start AREGS304

SP2-0310: unable to open file "AREGS304.sql"

05:11:24 SQL> 05:11:24 SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options

The sql is not in the directory:

BROWSE --- /ais01/src/sql/prod/* ------------------------------ INVALID COMMAND

COMMAND ===> SCROLL ===> CSR

NAME SIZE DATE TIME ATTR

AREGS303.sql 21512 11/04/04 10:29

AREGS303.sql.prev1 21334 11/02/02 08:38

AREGS303.sql.prev2 21261 11/01/25 11:31

AREGS400.sql 13002 07/04/03 13:22

Since this is a new sql, could we please make it follow standards before we place it into production?

Just a quick review reveals the following items which need changing:

1) The obsolete format of the modlog block should not be included:

--* DATE INIT SSR # REASON FOR THE CHANGE *

--* ---------- ---- ----- -------------------------------------------- *

--* mm/dd/ccyy--xx--Tnnnnn--Description-last line of modlog-don't delet

2) Comments that are not relevant to this sql should not be included:

--* Note: *

--* Portions of code are commented out with commenting being *

--* removed by sed command depending on whether the run is for *

--* current fiscal year or staffing (future FY). *

--* --*cur_fy is removed for current FY run *

--*--------------------------------------------------------------------*

3) set verify on

Should be included in the group of "set" statements at the beginning of the sql

4) The standard exit statement and comment block should be included at the end of the sql:

exit;

--*--------------------------------------------------------------------*

--* *

--* END OF AREGS304.SQL *

--* *

--*--------------------------------------------------------------------*

Please make these corrections ASAP so that the "standards compliant" version can be run when restarting the failed component.

As a reminder, there are sample sqls in /ais01/src/sql/updt on Kebler, demonstrating the sql standards:

/ais01/src/sql/updt/sql_appworx_plsql_sample

/ais01/src/sql/updt/sql_appworx_spool_sample

Oh.. one more thing. Does the title reflect the author of this sql?

Janice.

Aborted Module Name: HRMSS228.SSH_SFTP_01

Date: Day: Time: Resolution:

08/19/11 Fri 21:44 Deleted by Robin.

Error log and follow up comments:

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l /ais01/ftp/to/user/HRMSS228.HARTFORD_EOI.DAT

# > -rw-rw---- 1 appworx Gftp 13585 Aug 19 21:43 /ais01/ftp/to/user/HRMSS228.HARTFORD_EOI.DAT

# > sftp> -ls -l /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls" not found

# > sftp> put /ais01/ftp/to/user/HRMSS228.HARTFORD_EOI.DAT /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

# > Uploading /ais01/ftp/to/user/HRMSS228.HARTFORD_EOI.DAT to /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

# > sftp> ls -l /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

19 21:45:22-Parent: (2)Checking child process(2564492)

19 21:45:22-Parent: Child process[2564492] found

19 21:45:22-Parent: Checking child mem

19 21:45:22-Parent: Value in mem [N]

19 21:45:22-Looking for [/appworx/run/kill.6788301.00]

19 21:45:22-No Kill File found('/appworx/run/kill.6788301.00').

19 21:45:22-Parent: sleeping for 10 seconds.

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls" not found

# > (1)

#==============================================================================

# FATAL : Command failed with code : 1

#------------------------------------------------------------------------------

# RETURN CODE = 100

I checked on the vendor web site and the destination file

/DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

does not exist. This can happen if the vendor picks up the file and deletes it before we get a chance to confirm it is in place. We probably will want to confirm with the vendor whether or not they received the file (13585 bytes).

* If not, then we should be able to restart the SFTP component.

* If they did, then we may need to change an option for the SFTP to not fail if the file is not found after we upload it.

Elden.
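The option Elden describes could look roughly like the sketch below: a successful put followed by a missing file is reported as a warning rather than a failure when the vendor is known to pick files up quickly. The function name, the tolerate_pickup flag and the status strings are all hypothetical; none of them are existing SSH_SFTP options.

```python
def upload_status(put_succeeded, found_after_upload, tolerate_pickup=False):
    """Classify the outcome of an upload followed by a remote listing check.

    tolerate_pickup is a hypothetical switch for vendors who remove the file
    as soon as it arrives, so the follow-up listing cannot find it.
    """
    if not put_succeeded:
        return "FAILED"
    if found_after_upload:
        return "OK"
    return "WARNING" if tolerate_pickup else "FAILED"
```

A WARNING result would still call for confirming receipt with the vendor, as was done here, but it would not abort the component.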

I have sent an email to Leanne at Hartford asking her to verify that they got the file. I'll keep everyone posted with her response.

Leanne reported back that they did receive the file and all looks good.

-Bob-

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

08/22/11 Mon 14:02 See follow up below.

Error log and follow up comments:

WARNING: Prefs file removed in background /home/appworx/.java/.userPrefs/prefs.xml

WARNING: Prefs file removed in background /etc/.java/.systemPrefs/prefs.xml

Caused by:

Caused by: java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

+ print *** \n*** COPY END OF REMOTE SHELL LOG TO STD OUTPUT \n***

DV 1466687 is the bad DV. The attached update sql will remove the bad character. The bad character was in position 193, so Josh, who was looking at the first 90 chars, missed it.

Shawn will need to run the update, then Dermot can run the job.

This can wait till morning if necessary.

Thanks for everyone’s help.

John. W.

Attachment below:

set dv_chk_stub_txt =

'Dr. Browning Participant Support 8.22.11 -Dr. Browning needs to compensate 2 subjects that had to drop out of the Reebok Research study due to injury and illness -Compensation for Dr. Brownings research participants. To be paid in cash to research participants, NOT income to Dr. Browning. PAY BY CHECK. Log sheet to be returned to A/P upon completion of Disbursement Voucher. Questions? Please contact Dr. Browning at 491-5868.'

where fdoc_nbr = '1466687'

For future reference, any aborts within the KFSXPD_DY_PDP_DAILY_CHECK_ACH chain or its sub-chains (such as this one in KFSXFPPD_FP_DV_PREDISB_EXTR) should be resolved during working hours – and not be allowed to carry over to the next working day.

The BFS users expect the morning ach/check cycle to run, starting at 7:00 A.M. Today’s chain is currently in SELF-WAIT status, waiting for yesterday’s chains to complete. It does not make sense to complete yesterday’s chain now: unless someone caught it, the WAIT_FOR_TIME_03 component (after the aborted job) in yesterday’s chain would wait until 2:30 P.M., assuming we fixed the aborted job and let the chain continue. That would mean that today’s chain would also not start running until sometime after 2:30 P.M.

To clear out yesterday’s chain and allow today’s chain to proceed, I deleted the following from backlog within yesterday’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain:

FSXPDCH_PDP_CHECKS_EXTR2 sub-chain

KFSXPD_DY.WAIT_FOR_TIME_03 component

KFSXPDDR_PDP_DAILY_RPT2 sub-chain

And the aborted KFSXFPPD.KFSX_JAVA_01 component

This will allow the CHAIN_FINISH of the KFSXFPPD_FP_DV_PREDISB_EXTR sub-chain to complete, as well as the KFSXPD_DY.CHAIN_SUMMARY_01 (to provide the daily ach/check summary from yesterday) and the KFSXPD_DY_CHAIN_FINISH_01. Of course, this will also allow today’s “SELF-WAIT” KFSXPD_DY_PDP_DAILY_CHECK_ACH chain to proceed. I suspect that we will see the same error in today’s KFSXFPPD.KFSX_JAVA_01 component, as it appears to be data related. It appears that it was restarted several times yesterday, with the same results (error) – so unless the data has changed, it will likely fail on the same error in today’s KFSXFPPD.KFSX_JAVA_01 component. Should that be the case, it is important that the BFS users/KFSX Team work together to fix the error as soon as possible to allow this morning’s ach/check cycle to proceed.

Janice.

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

08/23/11 Tue 07:24 Restarted by Dermot.

Error log and follow up comments:

(PMT_NTE_ID,CUST_NTE_LN_NBR,CUST_NTE_TXT,LST_UPDT_TS,VER_NBR,PMT_DTL_ID,OBJ_ID) VALUES (?,?,?,?,?,?,?) '

* Exception message is [ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

* Vendor error code [12899]

* SQL state code [72000]

* Target class is 'org.kuali.kfs.pdp.businessobject.PaymentNoteText'

* PK of the target object is [id=10334944]

* Source object: paymentNoteText(id)=(10334944)

* The root stack trace is -->

* java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

This error will need to be fixed as soon as possible in order to minimize delay for this morning’s ach/check cycle.

Janice.

Last night, John found a solution for this problem.

As soon as a DBA applies the fix to the data, we will be ready to release it.

Can we just kill last night’s run and let today’s pick it up?

Last night we were under the impression that letting it go until this morning would not have a negative impact.

I have modified the logic to find special characters to search for the entire string rather than the first 90 characters.

I have added it to the resolution spreadsheet.

Josh.
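A minimal sketch of the "search the entire string" idea Josh describes. It reports every non-ASCII character by position, and also shows how one multi-byte character can push a 90-character slice to 91 bytes. Two things are assumptions and are not confirmed by the log: that the note text is split into 90-character pieces, and that the CUST_NTE_TXT limit is counted in bytes. The function names are hypothetical.

```python
def find_non_ascii(text):
    """Return (1-based position, character) for every non-ASCII character in text."""
    return [(pos, ch) for pos, ch in enumerate(text, start=1) if ord(ch) > 127]


def oversized_chunks(text, width=90, encoding="utf-8"):
    """Split text into width-character chunks and report the chunks whose
    encoded size exceeds width bytes, as (1-based start position, byte size)."""
    problems = []
    for start in range(0, len(text), width):
        chunk = text[start:start + width]
        size = len(chunk.encode(encoding))
        if size > width:
            problems.append((start + 1, size))
    return problems
```

Scanning the whole value rather than only the first 90 characters is what would have caught a character at position 193.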

See my earlier email (from around 7:15 this morning) – I already did the clean-up of yesterday’s stalled KFSXPD_DY_PDP_DAILY_CHECK_ACH by deleting the failed component and some of the downstream components. If a problem such as this cannot be fixed in a timely manner during working hours, then the appropriate course of action would be to delete downstream components… allowing only the CHAIN_SUMMARY and CHAIN_FINISH components to run, thereby completing that day’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain. It doesn’t make sense to wait until the next day to finish out the previous day’s check cycle because, for one, the users expect an ach/check cycle to run in the morning – not just the leftover “checks only” cycle from the previous afternoon. Additionally, taking the time to complete the previous day’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain would delay the current day’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain – perhaps until after 2:30 P.M. if the various WAIT_FOR_TIME components within the previous day’s chain are not forced to run “before their time”. But more importantly, it is just cleaner to start over with the new day – it contains the ach/check cycle so it just makes more sense to start over with that cycle and stay on track, timing-wise. The only danger would be if the BFS users had actually performed a FORMAT CHECKS yesterday (but I checked and no check xml files are present in /ais02/app/kfs/prd/work/staging/pdp/paymentExtract). They usually don’t perform FORMAT CHECKS until after receiving the A.M./P.M. KFSXPDDR report which lets them know what’s available for payment – and we never got to that subchain of the P.M. cycle yesterday.

If the problem is data related and/or requires a dba fix, then it will surface again in the next day’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain (the morning cycle) – at which time the fix can hopefully be implemented quickly enough so that the users’ morning ach/check cycle will not be delayed too long.

I suggest that you place the KFSXPD_DY.WAIT_FOR_TIME_01 component on hold and have someone manually verify with BFS that they have had time to perform the manual ‘FORMAT CHECKS’ process. They usually would have already received the report from KFSXPDDR (around 7:30)… so if that is delayed too long, they may not have time to review the report and perform ‘FORMAT CHECKS’ process prior to the 09:00 A.M. start time for the KFSXPD_DY.WAIT_FOR_TIME_01 component. They may not even have any ach/checks for today – that does happen sometimes. In that case, the timing is not such a big deal.

Once the KFSXPDDR sub-chain completes **AND** you’ve received confirmation from BFS that the FORMAT CHECKS has been completed (or they aren’t planning to do one this morning), then you may release the hold on the WAIT_FOR_TIME_01 component. Do you know who will be contacting BFS regarding this?

I just checked and there is a check xml file in /ais02/app/kfs/prd/work/staging/pdp/paymentExtract, which was created 11/08/23 08:58. I’d say we are “good to go” and could release the hold on WAIT_FOR_TIME_01 component.

Janice.

Aborted Module Name: DOITDEMO_01.FTPS_CURL_01

Date: Day: Time: Resolution:

08/31/12 Fri 20:49 Restarted by Steve.

12/01/12 Fri 11:37 See note from David below.

03/30/13 Sat 14:44 Restarted by Steve.

Error log and follow up comments:

08/31/12.

#==============================================================================

# FATAL : Command failed with code : 100

#------------------------------------------------------------------------------

# 2011.08.31-21:06:50 : RETURN CODE = 100 : /appworx/csu/exec/FTPS_CURL.PL

#==============================================================================

error is 100

The log shows that the file was opened and couldn't be written to:

# > < 125-FTP Server unable to obtain EXCLUSIVE use of G.F.CSU.DOWN1 which is held by: 0093 DDWRKFRC EXCL on SYSDSN

# > < 125 Data set G.F.CSU.DOWN1 is not available

This probably can be tried again since no data was written, according to the log. Please check with the process flow notes to verify if this can be restarted.

When it is restarted, if it fails again, then someone will have to follow-up with the state as to why the file is locked on their server.

Elden.

12/01/2012 11:37 DEPETERS

Noticed that DOITDEMO_01.FTPS_CURL_01 had failed. It could not find the source file. It looks like when the condition ran to copy the file it thought the utl_file was empty. I found the backup file and found that it did have data in it. Timing issue? I manually copied the backup file to the /ais01/ftp/to/user directory and restarted DOITDEMO_01.FTPS_CURL_01. It finished successfully.

03/30/2013 14:44 SSGREENE

DOITDEMO_01.FTPS_CURL_01 failed looking for file /ais01/ftp/to/user/DOITDEMO_01.HRMSS051_01.DAT

For some reason the utl file was never copied from HRMSDEMO_01.HRMSS051_01 to /ais01/ftp/to/user/DOITDEMO_01.HRMSS051_01.DAT like it should have been. I manually copied the backup of the utl file to /ais01/ftp/to/user, renamed it to DOITDEMO_01.HRMSS051_01.DAT and restarted the failed component, which finished successfully.
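The manual recovery used in the last two entries (copy the backup of the utl file into place when the staged file is missing or looks empty) can be sketched as follows. The function name is hypothetical, and whether a before condition could call something like this in AppWorx is an assumption.

```python
import shutil
from pathlib import Path


def stage_file(source, backup, target):
    """Copy the first non-empty candidate (source first, then backup) to target.

    Returns the path that was used. Raises FileNotFoundError if neither
    candidate exists with data in it.
    """
    for candidate in (Path(source), Path(backup)):
        if candidate.is_file() and candidate.stat().st_size > 0:
            shutil.copyfile(candidate, target)
            return candidate
    raise FileNotFoundError(f"no usable file among {source} and {backup}")
```

Falling back silently could hide the timing issue David suspected, so any run that uses the backup would still be worth logging for follow-up.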

Aborted Module Name: FAIDSAIG_EV.TDCLIENT_01

Date: Day: Time: Resolution:

09/01/11 Thu 00:05 See follow up below.

Error log and follow up comments:

+ print APPEND=Y

+ 1>> FAIDSAIG_EV_receive_cmdfile

+ print AUTOEXT=N

+ 1>> FAIDSAIG_EV_receive_cmdfile

+ print receiveclass=EXITFFOP)"

+ 1>> FAIDSAIG_EV_receive_cmdfile

+ read this_receive_tdclient

+ tdclient_out=/ais01/dat/work/prod/FAIDSAIG_EV.TDCLIENT_01.non_isir

+ tdclientc cmdfile=FAIDSAIG_EV_receive_cmdfile

+ 1> /ais01/dat/work/prod/FAIDSAIG_EV.TDCLIENT_01.non_isir 2>& 1

+ exit 107

I officially "hate" TDCLIENT!!! I had to build a temp version of the TDCLIENT.KSH to bypass the portion of the job which had already successfully completed and pick up where it left off. Why is it that if SAIG is going to not communicate with us -- it had to do that **between** collection of the ISIR and non-ISIR messages class files?? Just to make it more difficult for us!

I think we should have a follow-up Clarity incident to re-examine the TDCLIENT processing. Maybe we could separate the ISIR and non-ISIR processing into separate run types, thereby making restarts easier in situations like the one we faced this morning.

Cross your fingers that in the "temp" script I removed only what should have been removed!!

Janice.
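One way to picture the restart problem described above is a step runner that remembers which parts finished, so a rerun picks up where the failure occurred. This is a sketch of the idea only, not a proposal for TDCLIENT.KSH. The step names and the run callable are hypothetical.

```python
def run_steps(steps, completed, run):
    """Run each named step that is not already in completed.

    run(name) is expected to raise on failure. Steps that finish are added
    to completed, so calling run_steps again resumes at the failed step.
    """
    for name in steps:
        if name in completed:
            continue  # finished in an earlier attempt, do not repeat it
        run(name)
        completed.add(name)
    return completed
```

If the ISIR and non-ISIR collections were separate steps, a communication failure between them would leave the first one recorded as done, and a restart would repeat only the second.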

Aborted Module Name: KFSXCS52.KFSXS007_01

Date: Day: Time: Resolution:

09/02/11 Fri 03:07 See follow up below.

Error log and follow up comments:

Terminated Employee: Duncan,Karen Lee 44472 klduncan INACTIVE Employee

Terminated Employee: Edler,Joshua Robert 43384 jredler INACTIVE CSU Ex-Employee

Terminated Employee: Edwards,Ryan J 28464 edwa3314 INACTIVE Employee

Terminated Employee: Eisenhauer,Scarlett Frederike 52183 seisenha INACTIVE CSU Ex-Employee

declare

ERROR at line 1:

ORA-20100: Too many Users are being terminated. Verify the HR data.

ORA-06512: at line 355

There were over 100 people that were set to be terminated in KFS today.

Since that broke our threshold, the job failed.

Josh.

We have set up a catch in KFS that if over 100 people terminate then the Kuali user update is canceled. A couple of weeks ago this job tried to terminate thousands of people so we built this safety guard.

Last night the system sent over 100 people to terminate. Can you have someone verify that this is correct?

Theresa.

Yes, that sounds reasonable. Three processes run on the first of each month to automatically terminate 3 groups of assignments:

- Those with an Appt End Date more than 3 months past

- Those with an I-9 which expired more than 3 months past

- Students and non-student hourlies who haven’t been paid in the last 18 months

If you want to send me the full list when you have it, I’m happy to do a little more checking.

Carolee.

I will have the job restarted. Do you want me to set the value to 500 every day or just for today’s run?

Josh.

Let’s bump it for every day. We are just trying to catch something overly excessive. It sounds like it will not be uncommon for it to be at least this large once a month.

Theresa.

The maximum number of terminations allowed was increased from 100 to 500.

This flag is used as a safety measure.

This process worked as expected by raising the error. Users needed to verify the data and there were no technical problems.

HR reviewed the data and was comfortable with the number of terminations.

We have permanently increased this threshold since there are months when there could be a lot of activity.

Josh.
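The safety check discussed in this thread amounts to a simple threshold guard, sketched below. The exception and function names are hypothetical. The default of 500 reflects the value agreed above, and the actual check lives in the KFSXS007 PL/SQL, which is not shown in this log.

```python
class TooManyTerminations(Exception):
    """Raised when a run would terminate more users than the threshold allows."""


def check_termination_count(user_ids, max_terminations=500):
    """Return the number of users to terminate, or raise if it exceeds the threshold."""
    count = len(user_ids)
    if count > max_terminations:
        raise TooManyTerminations(
            f"{count} users are being terminated (limit {max_terminations}). Verify the HR data."
        )
    return count
```

Raising before any update is applied is what lets HR review the list first, which is how this abort was handled.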

Aborted Module Name: EIDSUPDT.HRMSS111_01

Date: Day: Time: Resolution:

09/06/11 Tue 22:39 Restarted by Joleen.

Error log and follow up comments:

+ print *** \n*** END SEARCH OF LOG FOR SQL ERROR STRINGS \n***

+ 1>> /ais01/dat/work/prod/EIDSUPDT.HRMSS111_01.6868218.6868225.00.2011_09_06_2239_sql_followup

+ cat /ais01/dat/work/prod/EIDSUPDT.HRMSS111_01.6868218.6868225.00.2011_09_06_2239_sql_followup

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

829180059 ORA-01422: exact fetch returns more than requested number of rows

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

The problem is in the CSUH_EMAIL_UPDATE function. There are two records in the per_all_people_f table that match the CSU ID (829180059). It looks like it's two different people, but with the same name. The CSU ID (attribute 12) needs to be changed for one of them.

-Bob-
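A minimal sketch of a check for the condition behind the ORA-01422: one CSU ID attached to more than one person. The input shape (pairs of ID and person key) and the function name are assumptions. The real data sits in per_all_people_f, whose layout is not shown here.

```python
from collections import defaultdict


def duplicate_ids(records):
    """Given (csu_id, person_key) pairs, return {csu_id: sorted person keys}
    for every CSU ID that is attached to more than one distinct person."""
    people_by_id = defaultdict(set)
    for csu_id, person_key in records:
        people_by_id[csu_id].add(person_key)
    return {
        csu_id: sorted(keys)
        for csu_id, keys in people_by_id.items()
        if len(keys) > 1
    }
```

Running a check like this ahead of the update would list the conflicting people so the ID can be corrected before the exact fetch fails.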

Who can/should make this change?

The person in Banner that I associate with this ID has an HR ID of H52492. Hope that makes sense.

Vicki.

Aborted Module Name: AREGORGN.AREGS002_01

Date: Day: Time: Resolution:

09/07/11 Tue 17:02 See follow up below.

Error log and follow up comments:

line=201210 BC 487A FNS2

line=201210 BC 487B FNS2

line=201210 BC 499A FNS2

line=201210 BC 499B FNS2

line=201210 BC 711A FNS2

line=201210 BC 711C FNS2

line=201210 BC 711D FNS2

line=201210 BC 711F FNS2

line=201210 BIOM 476A FEG3

line=201210 BIOM 476B FEG3

begin <<all_block>>

ERROR at line 1:

ORA-20500: Error count Exceeded 200

ORA-06512: at line 332

Hi Denise and Jerry

AREGS002 is aborting because it has more than 200 error messages. Most of the error messages look like the messages below. What do you want to do? Do you want to skip the AREGS002 module and work at completing your AREG schedule? Or do you want to do something else?

Attribute FBU3 Already Exists

201210 ACT 679A FBU3

Attribute FAG1 Already Exists

201210 AGRI 496A FAG1

Vicki.

Hi Vicki, I put that file out yesterday. Let's skip the AREGS002 and I'll look at it again, but let's get the AREG schedule completed.

Sorry for the delay, Jerry & I were in a meeting.

Denise.

The AREGS002 module has been skipped and the rest of the AREG schedule should start running.

Vicki.

Aborted Module Name: AROSFRQ1.TGRAPPL_01

Date: Day: Time: Resolution:

09/12/11 Mon 21:15 See note from Janice below.

Error log and follow up comments:

I happened to log in tonight and noticed the AROSFRQ1.TGRAPPL and AROSDPA1.TGRAPPL failures. Normally, we would not have AROSFRQ1 running at the same time as AROSDPA1 because we create the file to stop the AROSFRQ1 cycles before AROSDPA1 starts. However, this was an earlier AROSFRQ1.TGRAPPL, which failed with a resource deadlock:

RUN SEQUENCE NUMBER:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 682

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 1519

ORA-06512: at line 1

WRN-ORACERR: Error occurred in file "tgrappl.pc" at line 3,395

WRN-ERRSTMT: Following statement was last statement parsed:

begin tb_receivable . p_update ( p_PIDM => :ap_request_pidm , p_TRAN_N

tgrappl terminated with error

Once a TGRAPPL fails, then any subsequent TGRAPPL's (like tonight's AROSDPA1.TGRAPPL) will fail with:

* **WARNING** *

* You cannot submit this job - it is already running. *

* *

* You will also get this message if a previous run of *

* this program aborted. If this is the case, the *

* control record for that run must be deleted before *

* proceeding. (GJBPRUN record for this jobname with *

* a -1 one-up-no).

So, since I think the AROSFRQ1.TGRAPPL didn't really get off the ground due to the resources deadlock, I'm going to delete the aforementioned GJBPRUN record and try to resubmit AROSFRQ1.TGRAPPL. If that completes okay, then I'll restart AROSDPA1.TGRAPPL.

09/12/2011 21:35 JMWILKIN

That worked -- AROSFRQ1.TGRAPPL output seems normal. I waited until the remainder of the AROSFRQ1 cycle finished, then restarted the failed AROSDPA1.TGRAPPL.

By the way, as followup, we need to figure out what "deadlocked" with the AROSFRQ1.TGRAPPL.

AROSDTRN.TGRCLOS_01 was running at the same time - would these two programs fight over resources??

Aborted Module Name: AROSDPA1.TGRAPPL_01

Date: Day: Time: Resolution:

09/12/11 Mon 21:15 See note from Janice below.

Error log and follow up comments:

RUN SEQUENCE NUMBER:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 682

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 1519

ORA-06512: at line 1

WRN-ORACERR: Error occurred in file "tgrappl.pc" at line 3,395

WRN-ERRSTMT: Following statement was last statement parsed:

begin tb_receivable . p_update ( p_PIDM => :ap_request_pidm , p_TRAN_N

tgrappl terminated with error

Once a TGRAPPL fails, then any subsequent TGRAPPL's (like tonight's AROSDPA1.TGRAPPL) will fail with:

* **WARNING** *

* You cannot submit this job - it is already running. *

* *

* You will also get this message if a previous run of *

* this program aborted. If this is the case, the *

* control record for that run must be deleted before *

* proceeding. (GJBPRUN record for this jobname with *

* a -1 one-up-no).

09/12/2011 21:35 JMWILKIN

That worked -- AROSFRQ1.TGRAPPL output seems normal. I waited until the remainder of the AROSFRQ1 cycle finished, then restarted the failed AROSDPA1.TGRAPPL.

By the way, as followup, we need to figure out what "deadlocked" with the AROSFRQ1.TGRAPPL.

AROSDTRN.TGRCLOS_01 was running at the same time - would these two programs fight over resources??

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

09/15/11 Thu 07:04 Restarted by Dermot.

Error log and follow up comments:

2011-09-15 07:17:13,180 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

RunBatch ERROR: Exception found:

; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

Caused by: java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

at java.lang.Throwable.<init>(Throwable.java:67)

We have not received this morning’s check format and our Kuali people are out this morning. Is there a problem?

Thanks,

Jackie.

Yes, the pre-disbursements extract program, disbursementVoucherPreDisbursementProcessorExtractStep, failed. The notification regarding this failure was sent early this morning to the IS Kuali Team, but so far we have not heard anything back so I’m guessing that they are still working to resolve the problem.

Of course, this program is early in the daily ach/check process and we need to solve this problem before the checks will be produced.

I’ve placed a hold on the portion of the chain which would normally take off at 09:00 because you may not receive the report by then and/or have time to do the Format Checks.

I’ve included the IS Kuali Team on this email traffic, so hopefully they can update all of us with progress on solving the problem.

Janice.

We are in the process of updating the data to correct the special characters. We should have this resolved shortly.

Josh.

Aborted Module Name: AROSSTM1.AROSS302_01

Date: Day: Time: Resolution:

09/15/11 Thu 23:20 Restarted by Joleen.

Error log and follow up comments:

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "ODSMGR.TOKODST", line 48

ORA-00001: unique constraint (TAISMGR.TBRCCHG_INDEX_01) violated

ORA-06512: at "TAISMGR.TT_TBBACCT_INSERT_ODS_CHANGE", line 8

ORA-04088: error during execution of trigger

ORA-06512: at line 436

ORA-06512: at line 1086

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

It looks like the error was spawned from a deadlock.

Please restart the module.

Josh.

Aborted Module Name: ODSRAROS.ODSRS001_01

Date: Day: Time: Resolution:

09/15/11 Thu 23:26 Restarted by Joleen.

Error log and follow up comments:

old 6: csug_ods_refresh.log_begin_time('&REFRESH_APP');

new 6: csug_ods_refresh.log_begin_time('REFRESH_AR');

old 8: ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

new 8: ia_admin.mgkmap.P_RunETLMapSlots('3', job, 'REFRESH_AR', NULL, '');

old 14: csug_ods_refresh.log_end_time('&REFRESH_APP');

new 14: csug_ods_refresh.log_end_time('REFRESH_AR');

begin

ERROR at line 1:

ORA-20001: ODS Refresh Failed

ORA-06512: at line 12

Here are the errors I found in REFRESH_AR. Two mappings were reported with errors. They did finish successfully (see the highlighted numbers), but the IA Admin tool had trouble verifying this and considered them failed.

We have two options: we can re-run the Refresh AR to get the correct end date/timestamp (30-40 mins), or we can continue on with whatever follows this component.

OWNER	ID	MAP	ELT	RS	SEL	INS	DEL	START_TIME	END_TIME
JOBPRD	459500	DELETE_MTT_ACCOUNT	00:01:08	COMPLETE	130670	0	130307	9/15/2011 23:26	9/15/2011 23:27
JOBPRD	459511	UPDATE_MTT_ACCOUNT	00:12:44	COMPLETE	130322	130322	0	9/15/2011 23:33	9/15/2011 23:46

- Mark

Aborted Module Name: HRMSACH_QPS.NACHA_01

Date: Day: Time: Resolution:

09/16/11 Fri 08:12 See follow up below.

Error log and follow up comments:

HRMSACH_QPS.NACHA_01 / HRMSACH_NACHA_PROCESSING is in FIN-DB ERROR.

2011-09-16 08:10:49 Prompt 22 changed from "{#HRMS_{#2}_PAYROLL_TYPE}" to "{#HRMS_QPS_PAYROLL_TYPE}" by OSU=appworx JDBC Thin Client

2011-09-16 08:10:49 Prompt 24 changed from "{#HRMS_{#3}_PAYDATE_HRFORMAT}" to "{#HRMS_QUICK_PAYDATE_HRFORMAT}" by OSU=appworx JDBC Thin Client

2011-09-16 08:10:49 Prompt 25 changed from "{#HRMS_{#3}_PAYDATE_HRFORMAT}" to "{#HRMS_QUICK_PAYDATE_HRFORMAT}" by OSU=appworx JDBC Thin Client

CON-2011-09-16 08:12:01 Set Subvar

CON-2011-09-16 08:12:04 Set Subvar

CON-2011-09-16 08:12:07 Set Subvar

CON-2011-09-16 08:12:09 Set Subvar

2011-09-16 08:12:09 Prompt 22 changed from "{#HRMS_QPS_PAYROLL_TYPE}" to "21" by OSU=appworx JDBC Thin Client

2011-09-16 08:12:09 Prompt 24 changed from "{#HRMS_QUICK_PAYDATE_HRFORMAT}" to "2011/09/15 00:00:00" by OSU=appworx JDBC Thin Client

2011-09-16 08:12:10 Prompt 25 changed from "{#HRMS_QUICK_PAYDATE_HRFORMAT}" to "2011/09/15 00:00:00" by OSU=appworx JDBC Thin Client

2011-09-16 08:13:20 java.sql.SQLException: ORA-12899: value too large for column "APPWORX"."SO_JOB_QUEUE"."SO_LOG_REVIEWED" (actual: 1792, maximum: 1)

ORA-06512: at "APPWORX.AW5", line 2464

ORA-06512: at line 1

aw5.aw_condition_action

0 jobid: IN:NUMERIC:java.math.BigDecimal:6917597

1 condition_order: IN:NUMERIC:java.math.BigDecimal:10

2 action: IN:VARCHAR2:java.lang.String:SET SUBVAR

3 performed: IN:OUT:VARCHAR2:java.lang.String:N

4 actionArg: IN:VARCHAR2:java.lang.String:#HRMSACH_QPS_PAYROLL_ACTION_ID=33226831

5 results: OUT:NUMERIC::null

6 text: OUT:VARCHAR2::null

FIN-DB ERROR(FINISHED) 2011-09-16 08:13:20

2011-09-16 08:47:07

I have reviewed the output for request 6896549 and everything seems to be in order. It processed 4 employees successfully.

Can you provide more information pertaining to the abort you are seeing?

Steve H.

An after condition had failed:

HRMSACH_QPS.NACHA_01 appman after condition variables were manually updated and the component deleted so HRMSACH_NACHA_PROCESSING could continue.

David.

Aborted Module Name: KFSXCS52.KFSXS007_01

Date: Day: Time: Resolution:

01/19/12 Thu 08:57 Restarted by Dermot.

Error log and follow up comments:

PLS-00201: identifier 'CSUF_EMPLOYEE_PRIMARY' must be declared

ORA-06550: line 119, column 28:

PL/SQL: Item ignored

ORA-06550: line 120, column 28:

PLS-00352: Unable to access another database 'KRTEST@KRUSER'

ORA-06550: line 120, column 28:

PLS-00201: identifier 'CSUF_EMPLOYEE_PRIMARY' must be declared

ORA-06550: line 120, column 28:

Was this running on production?

I am a little concerned that any production job would be referencing KRTEST.

Josh.

I saw the same thing – and yes, this is running on AWPROD. The KFSXS007 sql runs using jobprd@kfsprd login, so we’re definitely running this sql against kfsprd.

Looks like CSUF_EMPLOYEE_PRIMARY, mentioned in the error message, is a view that selects from csuf_employee_primary@kfs_to_ods

It becomes difficult to follow the tracks across links, across views and so on, but it appears that something (maybe in odsprod) is pointing to krtest?

Janice.

I have found the problem: it looks like a synonym is incorrect.

I will work with a DBA to get the problem resolved and let scheduling know when we are ready to restart the module.

Josh.

Aborted Module Name: KFSXAPEI.KFSX_JAVA_01

Date: Day: Time: Resolution:

09/22/11 Thu 06:01 See follow up below.

Error log and follow up comments:

KFSXAPEI.KFSX_JAVA_01 / KFSXAPEI.KFSX_JAVA_01 is in ABORTED status.

If the error is not obvious in this output file, then try searching for INVALID to locate the correct error below in yellow.

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

***

+ egrep ^log4j:|^WARNING:|^ERROR:|^Exception|^Caused by: /ais02/log/KFSXAPEI.KFSX_JAVA_01.6944212.6944215.00.2011_09_22_0601.log

log4j:WARN File option not set for appender [LogFile].

log4j:WARN Are you using FileAppender instead of ConsoleAppender?

log4j:WARN File option not set for appender [MemoryLogFile].

log4j:WARN Are you using FileAppender instead of ConsoleAppender?

log4j:ERROR No output stream or file set for the appender named [LogFile].

WARNING: Prefs file removed in background /home/appworx/.java/.userPrefs/prefs.xml

WARNING: Prefs file removed in background /etc/.java/.systemPrefs/prefs.xml

+ print *** \n*** COPY END OF REMOTE SHELL LOG TO STD OUTPUT \n***

***

*** COPY END OF REMOTE SHELL LOG TO STD OUTPUT

2011-09-22 06:06:10,842 [main] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceOrderHolder :: Adding reject reason - Invoice Purchase Order Number is an Invalid Number (Invoice Order ID:280373 REPLACE)

To locate the log file go to /ais02/log

To locate the xml file (see below) go to /ais02/app/kfs/prd/work/staging/purap/electronicInvoice

<OrderReference orderID='280373 REPLACE' orderDate='2011-09-21'><DocumentReference payloadID='280373 REPLACE'></Do

</OrderReference>

</InvoiceDetailOrderInfo>

There should be no space in the order ID '280373 REPLACE'.

John inserted a _ between '280373_ REPLACE' and job was restarted.

Dermot.
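A check like the one Dermot describes could be run against an invoice file before the job is restarted. This is a minimal sketch in Python, assuming the file is well-formed XML and that the order ID lives in the `orderID` attribute of `OrderReference`, as in the fragment above. The sample document is simplified, and the second order ID in it is made up for illustration.

```python
import xml.etree.ElementTree as ET


def order_ids_with_whitespace(xml_text):
    """Return orderID attribute values that contain any whitespace."""
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter("OrderReference"):
        order_id = elem.get("orderID", "")
        if any(ch.isspace() for ch in order_id):
            bad.append(order_id)
    return bad


# Simplified, illustrative fragment; the real electronic invoice has more structure.
sample = """<InvoiceDetailOrderInfo>
  <OrderReference orderID='280373 REPLACE' orderDate='2011-09-21'/>
  <OrderReference orderID='280374' orderDate='2011-09-21'/>
</InvoiceDetailOrderInfo>"""

bad_ids = order_ids_with_whitespace(sample)
print(bad_ids)
```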

Aborted Module Name: FAIDDLDR_EV.RERIM12_04

Date: Day: Time: Resolution:

09/30/11 Fri 00:13 See follow up below.

Error log and follow up comments:

+ grep 6983812

6983812.00 BANNER FAIDDLDR_EV.RERIM12_09/30 00:16 00:00:02 ABORTED FAIDDLDR_DISB_REQUIREMENT

+ print Failure in spawned RERIM12 - abort this module

Failure in spawned RERIM12 - abort this module

+ exit 1

1 row created.

Elapsed: 00:00:00.01

'crpn12op.2011_09_29_1010.bak.xml'

ERROR at line 15:

ORA-12899: value too large for column "GENERAL"."GJBPRUN"."GJBPRUN_VALUE"

(actual: 32, maximum: 30)

As I was working on the KFSX job I noticed that FAIDDLDR_EV.RERIM-LOOP_01 failed in spawned FAIDDLDR_EV.RERIM12_04 with the following error:

Elapsed: 00:00:00.01

'crpn12op.2011_09_29_1010.bak.xml'

ERROR at line 15:

ORA-12899: value too large for column "GENERAL"."GJBPRUN"."GJBPRUN_VALUE" (actual: 32, maximum: 30)

I decided to remove this offending file since it looked like an exact duplicate of crpn12op.xml, as shown below. I copied the file with the too-long name to my directory and restarted the looper so that the Finaid schedule would not hang all night.

283 Kebler finaid% ls -l crpn*

-rw-rw---- 1 appworx Gprd 95370 Sep 30 00:06 crpn12op.2011_09_29_1010.bak.xml

-rw-rw---- 1 appworx Gprd 95370 Sep 30 00:06 crpn12op.xml

284 Kebler finaid% cp crpn12op.2011_09_29_1010.bak.xml ~dpeterso

David.
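The failure comes down to a file name of 32 characters being stored in a column that holds 30. A pre-check along these lines could flag such files before the looper picks them up. It is only a sketch in Python: the limit of 30 comes from the ORA-12899 message above, the function name is hypothetical, and it assumes the limit is counted in characters and that file names are plain ASCII.

```python
GJBPRUN_VALUE_MAX = 30  # taken from the ORA-12899 message: "maximum: 30"


def split_by_length(names, limit=GJBPRUN_VALUE_MAX):
    """Separate file names that fit within the limit from those that do not."""
    ok = [name for name in names if len(name) <= limit]
    too_long = [name for name in names if len(name) > limit]
    return ok, too_long


# The two file names from the directory listing above.
names = ["crpn12op.xml", "crpn12op.2011_09_29_1010.bak.xml"]
ok, too_long = split_by_length(names)
print("ok:", ok)
print("too long:", [(name, len(name)) for name in too_long])
```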

Aborted Module Name: ODSRAROS.ODSRS001_01

Date: Day: Time: Resolution:

10/17/11 Mon 23:25 Restarted by Joleen.

Error log and follow up comments:

23:25:46 11 if (error_ind <> 'N') then

23:25:46 12 raise_application_error(-20001,'ODS Refresh Failed');

23:25:46 13 end if;

23:25:46 14 csug_ods_refresh.log_end_time('&REFRESH_APP');

23:25:46 15 end;

old 6: csug_ods_refresh.log_begin_time('&REFRESH_APP');

new 6: csug_ods_refresh.log_begin_time('REFRESH_AR');

old 8: ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

new 8: ia_admin.mgkmap.P_RunETLMapSlots('3', job, 'REFRESH_AR', NULL, '');

old 14: csug_ods_refresh.log_end_time('&REFRESH_APP');

new 14: csug_ods_refresh.log_end_time('REFRESH_AR');

ORA-20001: ODS Refresh Failed

ORA-06512: at line 12

The error can be ignored. There is a bug in the IA Admin interface that pops up once in a while where it can’t verify if a mapping has completed, but the OWB audit log does show it is complete.

ls: 0653-341 The file /orautl/odsprod/ODSRAROS.ODSRS001_01.utl_file* does not exist.

MAP	ELT	RS	SEL	INS	DEL	START_TIME	END_TIME
UPDATE_MTT_ACCOUNT_DETAIL	00:01:56	COMPLETE	78525	78525	0	10/17/2011 23:45	10/17/2011 23:47
UPDATE_MTT_ACCOUNT	00:11:52	COMPLETE	131041	131041	0	10/17/2011 23:33	10/17/2011 23:45
DELETE_MTT_ACCOUNT_DETAIL	00:04:55	COMPLETE	78525	0	76266	10/17/2011 23:26	10/17/2011 23:31
DELETE_MTT_ACCOUNT	00:00:55	COMPLETE	131754	0	131004	10/17/2011 23:25	10/17/2011 23:26

Aborted Module Name: FAIDCFIM_FA.SWPCOFI_01

Date: Day: Time: Resolution:

11/10/11 Thu 07:01 Restarted by Joleen.

10/11/12 Thu 06:59 Restarted by Joleen.

Error log and follow up comments:

11/10/11.

ABORT: data file record number 1 contains incorrect number of fields

Import Error and Warning Legend

--------------------------------

IMP-001: File ID could not be match to SPBPERS, GOBINTL, or SWRSDET. Record not loaded.

IMP-002: Matching SWRSDET record does not exists for batch. Record not loaded

IMP-003: File birth date does not match SPBPERS_BIRTH_DATE.

IMP-004: Student name does not match SPRIDEN.

I've copied the COF file to your secure directory /userfiles/Ufaid/data/FAIDCFIM_FA.DECRYPT_01.7187999.DAT

Appears the first record has both header and data values - thus can't match.

Is this something where you can have COF send the corrected file back out, or should we try to fix it?

Phil.

10/11/12.

ORA-03135: connection lost contact

Process ID: 0

Session ID: 0 Serial number: 0

SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon> ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon> ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

Now turn on set -x for debug purposes

+ [ -f login.11131303 ]

+ echo Could not log in to SQL*Plus.

Could not log in to SQL*Plus.

+ echo Exiting with error (return code = 5).

I checked the jobprd@banprod login, and the test was successful. The job itself never got started, so no processing has occurred for it; even the insertion of prompt values into table gjbprun did not complete. Since there are no other conditions to worry about, I would just reset the failed component.

Gudrun.

There were no conditions on the component. I restarted and the component has finished running.

Joleen.

Aborted Module Name: HRMSS241.SSH_SFTP_04

Date: Day: Time: Resolution:

11/23/11 Wed 20:58 See note from Janice below.

04/28/14 Mon 07:20 Restarted by Robin.

Error log and follow up comments:

11/23/11.

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l {#ENCRYPT_DEST_FILE_7258991

# > ls: 0653-341 The file {#ENCRYPT_DEST_FILE_7258991 does not exist.

# > Shell exited with status 2

# > sftp> -ls -l /HRMSS241_NEWHIRE.pgp

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/HRMSS241_NEWHIRE.pgp" not found

# > sftp> put {#ENCRYPT_DEST_FILE_7258991 /HRMSS241_NEWHIRE.pgp

# > File "{#ENCRYPT_DEST_FILE_7258991" not found.

# > (1)

#==============================================================================

# FATAL : Command failed with code : 1

#------------------------------------------------------------------------------

When the SSH_SFTP_04 component was added last week, the associated prompt #1 value was missing the trailing } character.

I modified the chain definition, changing {#ENCRYPT_DEST_FILE_{chain_id} to {#ENCRYPT_DEST_FILE_{chain_id}}.

Likewise, for the aborted component, I added the ending } character, changing {#ENCRYPT_DEST_FILE_7258991 to {#ENCRYPT_DEST_FILE_7258991} and restarted the failed component. It completed successfully.

As follow-up, might be worthwhile to double-check the other chains that transfer files to HealthSmart to verify that all newly added SSH_SFTP component(s) have the source filename prompt value specified properly - {#ENCRYPT_DEST_FILE_{chain_id}}

Janice.
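The double-check suggested above amounts to confirming that every `{` in a prompt value has a matching `}`. A minimal sketch in Python follows, assuming prompt values can be exported as plain strings; the function name is hypothetical and this is not an AppWorx facility.

```python
def unclosed_braces(value):
    """Return the count of '{' left unclosed, or -1 if a '}' has no opening '{'."""
    depth = 0
    for ch in value:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return -1
    return depth


# Values taken from this entry: the broken and corrected prompt definitions.
for prompt in (
    "{#ENCRYPT_DEST_FILE_{chain_id}",
    "{#ENCRYPT_DEST_FILE_{chain_id}}",
    "{#ENCRYPT_DEST_FILE_7258991",
):
    print(prompt, "->", unclosed_braces(prompt))
```

A result of 0 means the value is balanced; anything else is worth a second look before the chain runs.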

04/28/14.

# > secureftp.healthsmart.comPermission denied (publickey).

Looks like something has changed at Healthsmart. Who do we contact to get the new login credentials?

Steve G.

Did the subvar for the identity file get changed? It seems to be set to a value other than expected.

Please check values against successful run 13020083 and retry.

Gudrun.

Shouldn't the idfile = "/home/jobprd/.ssh/csu_to_health_smart-4096-20111027"?

The current value for the idfile is "cstate@secureftp.healthsmart.com:/HRMSS241_NEWHIRE.pgp"

Please try to change the idfile to "/home/jobprd/.ssh/csu_to_health_smart-4096-20111027"

Also appman variable, #ssh_idfile_jobprd_healthsmart needs to be set to /home/jobprd/.ssh/csu_to_health_smart-4096-20111027

David.
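The mix-up here was a remote target (user@host:path) stored where a local key file path was expected. A rough sanity check might look like the sketch below, written in Python. The rule it applies (absolute local path, not a user@host: string) is an assumption made for illustration and the function name is hypothetical; it does not verify that the key file exists or is the right one.

```python
import re

# Matches strings shaped like "user@host:..." (an scp/sftp style remote target).
REMOTE_TARGET = re.compile(r"^[^/\s]+@[^:\s]+:")


def looks_like_identity_file(value):
    """True if the value is an absolute path and not a user@host:path target."""
    return value.startswith("/") and not REMOTE_TARGET.match(value)


# The expected and the incorrect values quoted in this entry.
expected = "/home/jobprd/.ssh/csu_to_health_smart-4096-20111027"
wrong = "cstate@secureftp.healthsmart.com:/HRMSS241_NEWHIRE.pgp"
print(looks_like_identity_file(expected))
print(looks_like_identity_file(wrong))
```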

I have made the corrections and HRMSS241_COBRA is now complete.

Robin.

Aborted Module Name: AREGORLA.AREGS519_01

Date: Day: Time: Resolution:

12/13/11 Tue 07:00 Restarted by Joleen.

Error log and follow up comments:

There is a missing file AREGORLA.AREGS519_01.DAT in the userfiles/Umath directory.

I’ll follow up with Lois Samer to see what she has to say.

Vicki.

Vicki and I talked. I have removed AREGORLA_ONREQ_LAST_ATTENDED from backlog. I will request this process flow back in when the user has the file available.

Joleen.

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

01/24/12 Tue 14:33 See below.

Error log and follow up comments:

Java(TM) SE Runtime Environment (build pap6460sr5-20090529_04(SR5))

IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 AIX ppc64-64 jvmap6460sr5-20090519_35743 (JIT enabled, AOT enabled)

J9VM - 20090519_035743_BHdSMr

JIT - r9_20090518_2017

GC - 20090417_AA)

JCL - 20090529_01

<#/ais02/job/temp/kfsx_java_ssh.ksh.110#> java -Xms1g -Xmx1g -classpath /opt/freeware/apache-tomcat-kfsprd/webapps/kfs-prd/WEB-INF/classes:/opt/freeware/apache-tomcat-kfsprd/common/lib/*:/opt/freeware/apache-tomcat-kfsprd/webapps/kfs-prd/WEB-INF/lib/*:. edu.csu.batch.service.RunBatch disbursementVoucherPreDisbursementProcessorExtractStep KFSXFPPD.disbursementVoucherPreDisbursementProcessorExtractStep.7529604.7529659.00

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT \n***

***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

***

+ egrep ^log4j:|^WARNING:|^ERROR:|^Exception|^Caused by: /ais02/log/KFSXFPPD.KFSX_JAVA_01.7529604.7529659.00.2012_01_24_1403.log

log4j:WARN File option not set for appender [LogFile].

log4j:WARN Are you using FileAppender instead of ConsoleAppender?

log4j:WARN File option not set for appender [MemoryLogFile].

log4j:WARN Are you using FileAppender instead of ConsoleAppender?

log4j:ERROR No output stream or file set for the appender named [LogFile].

WARNING: Prefs file removed in background /home/appworx/.java/.userPrefs/prefs.xml

WARNING: Prefs file removed in background /etc/.java/.systemPrefs/prefs.xml

Exception in thread "Thread-3" java.lang.OutOfMemoryError

We need to increase the memory from 1 GB to 2 GB. Anyone know how to do that?

John.

I changed the prompt value in the ABORTED job to 2g (prompt #5) and restarted the chain.

After the chain completed I also updated prompt #5 at the chain level within KFSXFPPD so that it will in future run with the value of 2g.

Dermot.

Aborted Module Name: FAIDTKNT_EV.LYNX_01

Date: Day: Time: Resolution:

01/13/12 Fri 02:20 Restarted by Steve.

Error log and follow up comments:

01/13/12.

Looking up wsprod.colostate.edu

Making HTTP connection to wsprod.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Retrying as HTTP0 request.

Looking up wsprod.colostate.edu

Making HTTP connection to wsprod.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Alert!: Unexpected network read error; connection aborted.

Can't Access `http://wsprod.colostate.edu/cwis231/autorun/parent_tknt_email.cfm?ay=FAIDTKNT_EV'

Alert!: Unable to access document.

lynx: Can't access startfile

I tried pinging wsprod.colostate.edu and got a response. I then reset FAIDTKNT_EV.LYNX_01 and it finished.

FAIDTKNT_TRACK_NOTIFICATION is proceeding.

Steve.

Aborted Module Name: FAIDCFIM_SP.SSH_SFTP_LIST_01

Date: Day: Time: Resolution:

01/19/12 Thu 06:59 See follow up below.

Error log and follow up comments:

# - sftp

# COMMAND : /usr/bin/sftp -b- -oIdentityFile="/home/jobprd/.ssh/csu_infosys_prod" cofcsu@ftp.college-assist.org

# > Welcome to COF

# > sftp> pwd

# > Remote working directory: /

# > sftp> lpwd

# > Local working directory: /ais101jfs/jobprd

# > sftp> ls -1 resp_query/FAIDCFEX_SP.2012_01_19_0600.gpg.resp.gpg

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/resp_query/FAIDCFEX_SP.2012_01_19_0600.gpg.resp.gpg" not found

# > (1)

#==============================================================================

# FATAL : Command failed with code : 1

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

# > (0)

# > (100)

#==============================================================================

# FATAL : Command failed with code : 100

It looks like the file (FAIDCFEX_SP.2012_01_19_0600.gpg) from COF this morning was empty. This caused our FAIDCFIM_SP failure. Can you check with them regarding this?

David.

Aborted Module Name: HRMSREC_SAL.SQLLOAD-LOOP_01

Date: Day: Time: Resolution:

01/19/12 Thu 10:15 See follow up below.

Error log and follow up comments:

Record 25: Rejected - Error on table "CSUH"."CSUH_CAMPUS_REC_TRANS_00", column EE_CONTRIBUTION.

ORA-01722: invalid number

Record 26: Rejected - Error on table "CSUH"."CSUH_CAMPUS_REC_TRANS_00", column EE_CONTRIBUTION.

ORA-01722: invalid number

MAXIMUM ERROR COUNT EXCEEDED - Above statistics reflect partial run.

Table "CSUH"."CSUH_CAMPUS_REC_TRANS_00":

3 Rows successfully loaded.

100 Rows not loaded due to data errors.

0 Rows not loaded because all WHEN clauses were failed.

0 Rows not loaded because all fields were null.

Space allocated for bind array: 99072 bytes(64 rows)

Read buffer bytes: 1048576

Please restart this program. The data had "$" on the amounts so it was failing. I fixed the data so we should be good to go.

Steve H.
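The rejected records failed because a "$" in the amount cannot be converted to a number. A pre-load clean-up could be sketched as follows in Python. The function name is hypothetical, and stripping thousands separators as well as "$" is an assumption; the entry above only mentions the "$".

```python
from decimal import Decimal, InvalidOperation


def clean_amount(raw):
    """Strip '$' and ',' from an amount; return a Decimal, or None if still invalid."""
    text = raw.strip().replace("$", "").replace(",", "")
    try:
        return Decimal(text)
    except InvalidOperation:
        return None


# Illustrative values only; these are not taken from the rejected file.
for raw in ("$25.00", "1,250.50", "abc"):
    print(repr(raw), "->", clean_amount(raw))
```

Values that come back as None would still be rejected by the load and need a manual look.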

Aborted Module Name: EIDSUPDT.HRMSS111_01

Date: Day: Time: Resolution:

01/20/12 Fri 23:34 Restarted by Joleen.

Error log and follow up comments:

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

821140538 ORA-01422: exact fetch returns more than requested number of rows

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

The problem is that the CSUH_EMAIL_UPDATE function, when called with int_ref_id = 821140538, is returning multiple rows.

I just checked... In the function there is a select statement and sure enough it returns 2 rows (in HRPROD, not HRTEST). The one record has a last name of "Calhoun", while the last name on the other one is "Calhoun delete".

-Bob-

Carolee

Can you please look into this and see if you can resolve this data issue in HRPROD so that we can proceed with/finish the EIDS schedule?

Vicki.

I'm not sure what needs to be done immediately. I'm waiting for the "good" record to be approved so we can merge any eID and ARIES records. Then I'll delete the other record. Should I delete the email from the good record as a fix for today?

Carolee.

Carolee,

The only way to fix this issue now is to either delete the record, change the attribute12 field (CSU ID), or change the effective_end_date (currently '31-dec-4712').

-Bob-

I deleted the CSU ID.

Carolee.
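ORA-01422 appears whenever a lookup that expects one row finds several, so the underlying question is which CSU IDs occur on more than one person record. The sketch below, in Python, shows that grouping on an in-memory list. The dictionary keys are hypothetical stand-ins for the real columns, and the third record is made up for illustration.

```python
from collections import defaultdict


def duplicate_ids(records):
    """Group last names by CSU ID and keep only IDs that occur more than once."""
    by_id = defaultdict(list)
    for rec in records:
        by_id[rec["csu_id"]].append(rec["last_name"])
    return {csu_id: names for csu_id, names in by_id.items() if len(names) > 1}


records = [
    {"csu_id": "821140538", "last_name": "Calhoun"},
    {"csu_id": "821140538", "last_name": "Calhoun delete"},
    {"csu_id": "000000001", "last_name": "Example"},  # made-up record
]
dups = duplicate_ids(records)
print(dups)
```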

Aborted Module Name: HRMSDED_SAL.HRMSRPTS-LOOP_02

Date: Day: Time: Resolution:

01/24/12 Tue 19:25 See follow up below.

Error log and follow up comments:

p_salary_start='21-DEC-2011'

p_salary_end='24-JAN-2012'

element1='Campus Recreation'

element2='Campus Recreation Dues'

log_file='No'

------------

Execution options

VERSION=2.03b ORIENTATION=LANDSCAPE

Current NLS_LANG and NLS_NUMERIC_CHARACTERS Environment Variables are :

American_America.US7ASCII

Enter Password:

REP-0091: Invalid value for parameter 'ELEMENT1'.

Phase 4 of Salary processing, including the email to the listserv about Salary Payroll being done is waiting for HRMSDED_SAL to complete.

The newly added HRMSR317 report failed (see Robin's email below). This was tested on AWTEST, but I notice that the HRMSR317 job definition on AWPROD has different prompt default values than on AWTEST, which I'm suspecting may be causing the failure?

On AWPROD, the following values were passed to HRMSR317:

element1='Campus Recreation'

element2='Campus Recreation Dues'

log_file='No'

but on AWTEST, the values used were:

element1='5207'

element2='5255'

log_file='N'

What values should be used on AWPROD?

Janice.
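Comparing the prompt values side by side makes the mismatch between the two environments explicit. A minimal sketch in Python, assuming the values have been copied by hand into dictionaries; the function name is hypothetical and nothing here queries AppWorx.

```python
def diff_prompts(prod, test):
    """Return {prompt: (prod_value, test_value)} for every prompt that differs."""
    keys = sorted(set(prod) | set(test))
    return {k: (prod.get(k), test.get(k)) for k in keys if prod.get(k) != test.get(k)}


# Values quoted in the message above.
awprod = {"element1": "Campus Recreation", "element2": "Campus Recreation Dues", "log_file": "No"}
awtest = {"element1": "5207", "element2": "5255", "log_file": "N"}

diffs = diff_prompts(awprod, awtest)
for name, (prod_value, test_value) in diffs.items():
    print(f"{name}: AWPROD={prod_value!r} AWTEST={test_value!r}")
```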

You are correct about the parameters.

Bev.

Robin,

Please make the parameter changes to the HRMSR317 job (module) definition on AWPROD and then reset the HRMSDED_SAL.HRMSRPTS-LOOP_02 component. It keeps track of which reports it has completed, so should pick up with HRMSR317, which by the way is the last report that needs to run.

Oh, and this failure is also holding up the capture of the following Salary reports to VistaPlus:

HRMSR002 HRMSR003 HRMSR005 HRMSR040 HRMSR041 HRMSR042

HRMSR043,CSU_FAC_FLEX_COMBINED,

HRMSR043,CSU_ST_FLEX_COMBINED,

HRMSR240, HRMSR315, HRMSR316,

So all of these, along with the HRMSR317, will be captured as soon as we have a successful completion on HRMSR317.

Janice.

Aborted Module Name: FAIDALEX_EV.SSH_SFTP_01

Date: Day: Time: Resolution:

01/24/12 Tue 18:02 See follow up below.

Error log and follow up comments:

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > Permission denied (password,gssapi-with-mic).

***

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

Joleen.

I just tried manually logging into the server and it worked this time.

/usr/bin/sftp -oIdentityFile="/home/jobprd/.ssh/csu_to_elmnet" SCH05FO@ftp.elmproduction.com Connecting to ftp.elmproduction.com...

sftp> exit

You should be able to follow the appropriate restart instructions with any required approval.

Elden.

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

02/06/13 Wed 14:00 Decrease the employee Customer Name field.

Error log and follow up comments:

2013-02-06 17:26:37,358 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 92, maximum: 90)

; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 92, maximum: 90)

The DV_CNTCT_PRSN_NM was "Stipend for participation in Upward Bound program for Block 1 Sep12-Dec12 970-491-3551". Concatenated with the “Info: ” prefix, this created a string that was longer than 90 characters and caused the error. Unlike the entries above, this particular error was not caused by a special character.

The resolution was to shorten the person nm by 2 characters. In this case the area code for the phone number was removed.

Documentation will be created to check for these scenarios, which are listed below.

All of these have the potential of causing the note to be over 90 characters.

pnt.setCustomerNoteText("Info: " + document.getDisbVchrContactPersonName() + " " + document.getDisbVchrContactPhoneNumber());

pnt.setCustomerNoteText("Send Check To: " + dvSpecialHandlingPersonName);

pnt.setCustomerNoteText(dvSpecialHandlingLine1Address);

pnt.setCustomerNoteText(dvSpecialHandlingLine2Address);

pnt.setCustomerNoteText(dvSpecialHandlingCity + ", " + dvSpecialHandlingState + " " + dvSpecialHandlingZip);

pnt.setCustomerNoteText("Attachment Included");

pnt.setCustomerNoteText("Reimbursement associated with " + dvnet.getDisbVchrServicePerformedDesc());

pnt.setCustomerNoteText("The total per diem amount for your daily expenses is " + dvnet.getDisbVchrPerdiemCalculatedAmt());

pnt.setCustomerNoteText("The total dollar amount for your vehicle mileage is " + dvnet.getDisbVchrPersonalCarAmount());

pnt.setCustomerNoteText(exp.getDisbVchrExpenseCompanyName() + " " + exp.getDisbVchrExpenseAmount());

pnt.setCustomerNoteText("Payment is for the following individuals/charges:");

pnt.setCustomerNoteText(dvpcr.getDvConferenceRegistrantName() + " " + dvpcr.getDisbVchrExpenseAmount());

Run the script below to search for bad characters and for customer notes made too long by the names and phone numbers.

G:\DOC\KFS\Production_Fixes\Production Recovery\KFSXFPPD_find.sql

Josh.
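The length problem can be reproduced outside the database. The sketch below, in Python, measures a note the way a byte-limited column would. It assumes the 90 in the ORA-12899 message is a byte limit and that the database character set stores some special characters in more than one byte (UTF-8 is used here as a stand-in). Both are assumptions, and this is not the KFSXFPPD_find.sql script.

```python
CUST_NTE_TXT_MAX = 90  # from the ORA-12899 message: "maximum: 90"


def note_length(text, encoding="utf-8"):
    """Length of the note in bytes under the assumed encoding."""
    return len(text.encode(encoding))


def note_fits(text, limit=CUST_NTE_TXT_MAX):
    """True if the encoded note fits within the column limit."""
    return note_length(text) <= limit


# The contact text quoted above, with the "Info: " prefix the extract step adds.
contact = "Stipend for participation in Upward Bound program for Block 1 Sep12-Dec12 970-491-3551"
note = "Info: " + contact
print(note_length(note), note_fits(note))

# A 90-character note with one accented character takes 91 bytes in UTF-8.
accented = "é" + "a" * 89
print(len(accented), note_length(accented), note_fits(accented))
```

The second case shows how a note that looks like 90 characters on screen can still exceed the limit, which would be consistent with the earlier "actual: 91" failure attributed to special characters.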

Aborted Module Name: KFSXTXW2.SEND_MAIL_01

Date: Day: Time: Resolution:

02/01/12 Wed 10:03 See follow up below.

Error log and follow up comments:

# FATAL : Error opening file (/ais01/bkp/KFSXTXW2.HRMSS244_01.RPT) : A file or directory in the path name does not exist.

The HRMSS244 step was skipped because the java program did **NOT** create a 1099 file, which must be used as the input to HRMSS244. I guess we’ve never had this situation occur before – nor, apparently, did we expect it because there is no logic to stop the SEND_MAIL component from running in this situation. Ironically, we do have logic in place to skip the CHAIN_SQL_INIT and HRMSS244 components if no 1099 file – just didn’t have that logic on the SEND_MAIL? SEND_MAIL is failing because it doesn’t find the “report” that would have been created from HRMSS244, which in this case didn’t even run!

The solution will require determining why the Java program, electronicFilingStep, created no output file, and then rerunning KFSXTXW2 from the beginning.

If it helps, there was an exceptions file created from electronicFilingStep – attached to this email and also available in Kebler file: /ais01/bkp/KFSXTXW2.7571154.1099_exc.csv

Janice.

Yes, this csv file contains a critical error message indicating that a business validation failed which has prevented the file from being created. This is a vital piece of information.

I’d like to challenge Gudrun/Dermot/Steve to remember this for next year.

I am working with BFS to fix the problem. It looks like it will require a Java coding change that cannot take place until next Wednesday. Can someone please cancel/delete this job flow/chain until we are ready?

John.

I’d like to challenge Dermot/Steve to be sure this is documented in KFSXTXW2 Process Flow notes. Even better… maybe we should create a follow-up Incident, whereby the process flow is modified to send out an alternate email, to which the exceptions file would be attached and indicate that the 1099 file did not get created. This would actually be easy to do if we just added some BEFORE conditions to SEND_MAIL component to set value in a new chain specific subvar, #which_rpt_{chain_id} as follows:

When {#kfsx_1099_{chain_id}} = Y, then set #which_rpt_{chain_id}={#bkp}/{#1}.HRMSS244_01.RPT

When {#kfsx_1099_{chain_id}} != Y, then set #which_rpt_{chain_id}={#mailst}/SEND_MAIL.KFSXTXW2.NO_1099.TXT

Then use this new subvar {#which_rpt_{chain_id}} as the value for the SEND_MAIL prompt #12, rather than the current value of {#bkp}/{#1}.HRMSS244_01.RPT

Create /ais01/dat/misc/mailst/SEND_MAIL.KFSXTXW2.NO_1099.TXT file to contain something like this:

***ERROR*** electronicFilingStep did NOT create a 1099 file.

See attached exceptions file for possible reasons that 1099 file creation did not occur.

The solution described above would be cleaner – the SEND_MAIL component would not fail due to a missing {#bkp}/{#1}.HRMSS244_01.RPT file **and** it would hopefully get troubleshooting going in the right direction via examination of the exceptions file attached to the email.

Anyone care to pursue this alternative?

If we hurry to implement above, we’d have the perfect testing opportunity in production right now.

Janice.

Completed Clarity Task (T08160).

Dermot.
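A minimal Python sketch of the BEFORE-condition logic proposed in this entry, for illustration only. The function name and its arguments are hypothetical. It assumes {#bkp} resolves to /ais01/bkp and {#1} to the process flow name, as the failing path in the error log suggests. The real change would be made in the Appworx conditions, not in Python.

```python
def choose_report(flag_1099: str, bkp_dir: str, mailst_dir: str, flow_name: str) -> str:
    """Return the file SEND_MAIL should use for prompt #12.

    flag_1099 stands in for {#kfsx_1099_{chain_id}}; the other arguments
    stand in for {#bkp}, {#mailst} and {#1} (assumed meanings).
    """
    if flag_1099 == "Y":
        # The 1099 file was created, so HRMSS244 ran and its report exists.
        return f"{bkp_dir}/{flow_name}.HRMSS244_01.RPT"
    # No 1099 file: fall back to the static "no 1099" notice.
    return f"{mailst_dir}/SEND_MAIL.KFSXTXW2.NO_1099.TXT"


if __name__ == "__main__":
    print(choose_report("N", "/ais01/bkp", "/ais01/dat/misc/mailst", "KFSXTXW2"))
```

With this selection in place, SEND_MAIL always has an existing file to send, whether or not HRMSS244 ran.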

Aborted Module Name: KFSXCS53.SSH_EXEC_01

Date: Day: Time: Resolution:

02/04/12 Sat 00:12 See note from Janice below.

Error log and follow up comments:

02/04/2012 11:33 JMWILKIN

Just taking a quick look at Appman - making sure some recent Faid process flow renames are working okay.

Faid schedule had already completed, but I did notice the KFSXCS53.SSH_EXEC_01 failure. KFSXCS53_CSU_LOAD_ID_ATTACH process flow runs weekly on Friday.

Error indicates that appworx user is not authorized to execute SSH_EXEC script.

Sure enough, between the previous Friday's run and yesterday's run, the following entry had been removed from the authoriz.list file:

appworx /appworx/csu/exec/SSH_EXEC.PL 1

I put this entry back into the authoriz.lis file and restarted the component, which has now successfully completed. KFSXAM99 "schedule done" should now be able to proceed once the remaining components of KFSXCS53 have completed.

Aborted Module Name: AREGORGN.AREGS411_01

Date: Day: Time: Resolution:

03/04/13 Mon 17:01 Restarted by Joleen.

09/25/13 Wed 09:38 Restarted by Steve.

Error log and follow up comments:

17:01:15 295 v_api_count := v_api_del_count + v_api_add_count;

17:01:15 296

17:01:15 297 if v_api_count > 200 then

17:01:15 298 raise_application_error(-20500,'Error count Exceeded 200');

17:01:15 299 end if;

17:01:15 300

Problem Inserting the Alt Pin record for:

A,201390,821045156,ADVR,841804

error is: ORA-20100: ::Cannot create, record already exists::

This error appears 201 times in the utl file, once for each of 201 different records, which exceeds the limit of 200 checked at line 297 of the listing above.

Sue fixed the input file. I restarted AREGORGN.AREGS411. It has finished running.

Joleen.

09/25/13.

09:38:12 294

09:38:12 295 v_api_count := v_api_del_count + v_api_add_count;

09:38:12 296

09:38:12 297 if v_api_count > 200 then

09:38:12 298 raise_application_error(-20500,'Error count Exceeded 200');

09:38:12 299 end if;

I’ve ftp’d a new file. Hope this one works.

Can you try AREGORGN/AREGS411 again?

Denise.
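Before restarting AREGS411 with a corrected input file, it can help to check how many "record already exists" errors the previous utl file contained. The following is a rough Python sketch, not part of the production job. The function names are hypothetical, and it assumes each failing record produces one line containing both "ORA-20100" and "record already exists", as in the sample above.

```python
ERROR_LIMIT = 200  # threshold seen in the AREGS411 listing (v_api_count > 200)


def count_duplicate_errors(utl_lines):
    """Count utl lines reporting 'Cannot create, record already exists'."""
    return sum(
        1
        for line in utl_lines
        if "ORA-20100" in line and "record already exists" in line
    )


def would_abort(utl_lines, limit=ERROR_LIMIT):
    """True when the error count exceeds the limit, as in the PL/SQL check."""
    return count_duplicate_errors(utl_lines) > limit


if __name__ == "__main__":
    sample = ["error is: ORA-20100: ::Cannot create, record already exists::"] * 201
    print(count_duplicate_errors(sample), would_abort(sample))
```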

Aborted Module Name: KFSXFPPC.KFSX_JAVA_04

Date: Day: Time: Resolution:

02/29/12 Wed 19:16 Restarted by Dermot.

07/18/12 Wed 19:18 Restarted by Dermot.

Error log and follow up comments:

02/29/12.

at org.springmodules.orm.ojb.PersistenceBrokerTemplate.execute(PersistenceBrokerTemplate.java:141)

at org.springmodules.orm.ojb.PersistenceBrokerTemplate.store(PersistenceBrokerTemplate.java)

... 131 more

2012-02-29 19:21:34,975 [main] INFO edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPC.procurementCardRouteDocumentsStep.7723638.7723649.00 steps: [procurementCardRouteDocumentsStep]

2012-02-29 19:21:34,975 [main] INFO edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.kuali.rice.kew.exception.WorkflowRuntimeException: java.lang.RuntimeException: post processor caught exception while handling route status change: OJB operation; uncategorized SQLException for SQL []; SQL state [61000]; error code [60]; ORA-00060: deadlock detected while waiting for resource

; nested exception is java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource

RunBatch ERROR: Exception found:

This step ABORTED due to both KFSXFPPC and KFSXFPAA trying to access the same table at precisely the same time.

Dermot.

Please restart this module.

Josh.

07/18/12.

2012-07-18 19:18:48,031 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springframework.dao.DataAccessResourceFailureException: Could not open OJB PersistenceBroker; nested exception is org.apache.ojb.broker.PBFactoryException: Transaction synchronization failed - wrong status of external JTA tx. Expected was an 'active' or 'no transaction', found status is 'STATUS_MARKED_ROLLBACK'

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.44#> [[ 1 > 0 ]]

<#errtrap_ssh.44#> exit 1

As per instructions from John, the job was restarted and completed successfully.

Dermot.

Aborted Module Name: KFSXCS14.KFSXS011_01

Date: Day: Time: Resolution:

03/01/12 Thu 07:45 Restarted by Dermot.

Error log and follow up comments:

ORA-20001: Error in KFSX011.sql: -20001 -ERROR- ORA-20001: Error in

KFSX011.sql: -60 -ERROR- ORA-00060: deadlock detected while waiting for

resource

ORA-06512: at line 537

07:45:25 519 IF fringe_error_count > 1 THEN

07:45:25 520 DBMS_OUTPUT.PUT_LINE ('-');

07:45:25 521 DBMS_OUTPUT.PUT_LINE ('WARNING '|| to_char(fringe_error_count) || ' Fringe Distribution errors found');

07:45:25 522 DBMS_OUTPUT.PUT_LINE ('-');

07:45:25 523 END IF;

07:45:25 524 DBMS_OUTPUT.PUT_LINE ('Total GL count: ' || to_char(tot_gl_count) );

07:45:25 525 DBMS_OUTPUT.PUT_LINE ('Total Fringe count: ' || to_char(tot_fringe_count) );

07:45:25 526 DBMS_OUTPUT.PUT_LINE ('Total Cap Const CSU cash split: ' || to_char(csu_cash_count) );

07:45:25 527 DBMS_OUTPUT.PUT_LINE ('Total Cap Const State cash split: ' || to_char(state_cash_count) );

07:45:25 528 DBMS_OUTPUT.PUT_LINE ('Total Cap Const Fed cash split: ' || to_char(fed_cash_count) );

07:45:25 529 DBMS_OUTPUT.PUT_LINE ('Total Staging Records Created count: '|| to_char(stage_count) );

07:45:25 530 DBMS_OUTPUT.PUT_LINE

07:45:25 531 ('**** End of KFSXS011 ' ||to_char(sysdate,'MM/DD/YYYY HH24:MI:SS'));

07:45:25 532

07:45:25 533 EXCEPTION

07:45:25 534 WHEN OTHERS THEN

07:45:25 535 DBMS_OUTPUT.PUT_LINE

07:45:25 536 ('**** Exception Encountered on gl_entry_t: '||current_gl_rec_string|| ' ERROR: '||SQLERRM);

07:45:25 537 raise_application_error(-20001,'Error in KFSX011.sql: '||current_gl_rec_string|| SQLCODE ||

07:45:25 538 ' -ERROR- '||SQLERRM);

07:45:25 539

07:45:25 540 END;

Please restart the aborted job.

Josh.

There was a conflict between 2 scripts accessing the same table at the same time, which caused a “deadlock detected while waiting for resource” error and caused KFSXCS14.KFSXS011_01 to ABORT. The chain in contention with this script was KFSXCS12.KFSXS011_01.

Chain was restarted and completed successfully. If this ABORT happens again, we may need to add a dependency between these chains to prevent a recurrence.

Dermot.
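The manual fix in this entry was simply to restart the chain once the competing chain had finished. As an illustration only, the Python sketch below shows the same idea as an automatic retry. This is not how KFSXS011 is implemented. DeadlockError is a hypothetical stand-in for whatever error a database driver raises for ORA-00060, and the attempt count and delay are arbitrary assumptions.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying ORA-00060."""


def run_with_retry(job, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Run job(); on a deadlock, wait and try again up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return job()
        except DeadlockError:
            if attempt == attempts:
                raise  # give up and let the scheduler mark the step ABORTED
            # Wait a little longer after each failed attempt.
            sleep(delay_seconds * attempt)
```

A dependency between the two chains, as suggested in this entry, avoids the contention altogether; a retry only recovers from it after the fact.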

Aborted Module Name: KFSXBMAI.KFSX_JAVA_01

Date: Day: Time: Resolution:

03/01/12 Thu 19:22 Files resubmitted from RMS System.

Error log and follow up comments:

org.kuali.kfs.sys.exception.ParseException: error Parsing error was encountered on line 15, column 47: cvc-datatype-valid.1.2.1: '53 0' is not a valid value for 'integer'.

Hi Ron/Tyler,

Last night the BMP account field failed. It appears that there was a bad account in one of the xml files.

I have attached the xml of the bad account. I changed the file extension so it wouldn’t get picked off by the email filter.

Account number “53 0” was not a valid integer.

This entry was found in file:

KFSXBMAI.UresspC2012-03-01.xml

Do you want to resubmit the file tonight?

Let me know how you would like to proceed or if I can provide additional information.

diagnostic line number is 15.

Josh.

We have backup of the files and will resubmit.

Just to be clear, we’ll correct and resubmit from our backup files – no need for you to retrieve the xml.

Ron.

We do not restart. We let it auto cancel, and Ron/Tyler will resubmit the files from their RMS system. So we merely report the problem to them.

John.

How to locate error in file.

Check Appman Chain “KFSXBMAI /COLLECT_FILES” which is the step before the ABORTED Java step.

Prompt 1 will lead you to the subvar {#kfsxprod}/{module}.DRIVER.DAT. Now click on the subvar icon “#” and type kfsxprod; this will show the Kebler location: /ais01/dat/kfsx/prod

Then on Kebler,

cd /ais01/dat/kfsx/prod

ls KFSXBM*

KFSXBMAI.COLLECT_FILES_01.DRIVER.DAT

cd /userfiles/Uressp/data

$ ls KFSXB*

ls: 0653-341 The file KFSXB* does not exist.

If the file does not exist in this directory then go to the bkp directory.

$ cd /ais01/bkp

KFSXBMAI.UresspS2012-03-01.xml.2012_03_01_1922.bak

KFSXBMAI.UresspS2012-03-01.xml.TEMP.2012_03_01_1921.bak

<accountName>Dummy Account - do not use </accountName>
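Once the backup copy of the xml file has been found, the parser message gives a line and column (line 15, column 47 in this incident). The Python sketch below is a hedged illustration of how to display that location. The function name is hypothetical, and it assumes the reported column is 1-based; some parsers report the position just past the offending value.

```python
def show_parse_location(xml_text: str, line: int, column: int) -> str:
    """Return the reported line with a caret under the reported column."""
    lines = xml_text.splitlines()
    if not 1 <= line <= len(lines):
        raise ValueError(f"line {line} is outside the file (1..{len(lines)})")
    target = lines[line - 1]
    # Indent the caret past the "N: " prefix, then to the 1-based column.
    prefix_width = len(str(line)) + 2
    pointer = " " * (prefix_width + max(column - 1, 0)) + "^"
    return f"{line}: {target}\n{pointer}"


if __name__ == "__main__":
    sample = "<a>\n<n>53 0</n>\n</a>"
    print(show_parse_location(sample, 2, 4))
```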

Aborted Module Name: AREGTTRN_TOD_TRANSCRIPT

Date: Day: Time: Resolution:

03/09/12 Fri 03:22 See follow up below.

Error log and follow up comments:

Here's the SFTP command that failed and the reason "Hostname and service name not provided or found" which likely indicates a problem with DNS lookup or a general network problem somewhere between CSU and iwantmytranscript.com.

...

# COMMAND : /usr/bin/sftp -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109" colora-88@iwantmytranscript.com

4 01:20:35-Parent: (1)Checking child process(2822572)

...

4 01:22:45-No Kill File found('/appworx/run/kill.7746386.00').

4 01:22:45-Parent: sleeping for 10 seconds.

# > ssh: iwantmytranscript.com: Hostname and service name not provided or found^M

# > Connection closed^M

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

...

Elden.

All,

It's important that some analysis occur prior to restarting any aborted components within the various Transcript process flows. We may wish to consider documenting, within the Abort Documentation, some of the errors which have been encountered (such as described in Elden's email below) to serve as a guide for handling similar aborts in the future. I believe the consensus is that such failures to connect, as shown below, would in general be "safe" to restart. Obviously, failures which occur during an SSH transfer (when a connection has been established) would require additional investigation to determine if any transfers/partial transfers had occurred and/or if any cleanup activity would need to occur prior to restarting an aborted component.

If Abort documentation is to be maintained via the AREGTTRN_TOD_TRANSCRIPT sub-process flow, then it probably would be a good idea to have a reminder within Abort Documentation for the main process flows, AREGDYTS_TRANSCRIPTS and AREGFQTR_SEND_TRANSCRIPTS, to refer to the AREGTTRN_TOD_TRANSCRIPT sub-process flow Abort Documentation.

Janice.
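The restart guidance above can be summarised as a simple check on the failed component's log. The Python sketch below is illustrative only. The function name and the single marker string are assumptions based on the one error shown in this entry, and any real list of markers would need to be agreed by the team.

```python
# Messages that indicate the connection was never established (assumed list,
# taken from the single example in this entry).
CONNECT_FAILURE_MARKERS = (
    "Hostname and service name not provided or found",
)


def restart_guidance(log_text: str) -> str:
    """Suggest whether an aborted transcript transfer can simply be restarted."""
    if any(marker in log_text for marker in CONNECT_FAILURE_MARKERS):
        return "restart: connection was never established"
    # Anything else may have failed mid-transfer, so a person should check
    # for partial transfers and cleanup before restarting.
    return "investigate: possible partial transfer"
```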

Aborted Module Name: Transcript Connection Failure

Date: Day: Time: Resolution:

03/05/12 Mon 10:03 See follow up below.

Error log and follow up comments:

...

# COMMAND : /usr/bin/sftp -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109" colora-88@iwantmytranscript.com

4 01:20:35-Parent: (1)Checking child process(2822572)

...

4 01:22:45-No Kill File found('/appworx/run/kill.7746386.00').

4 01:22:45-Parent: sleeping for 10 seconds.

# > ssh: iwantmytranscript.com: Hostname and service name not provided or found^M

# > Connection closed^M

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

Elden.

Aborted Module Name: KFSXCS31.KFSX_JAVA_01

Date: Day: Time: Resolution:

03/21/12 Wed 07:32 Aborted step deleted to allow process flow to continue.

Error log and follow up comments:

2012-03-21 07:53:48,892 [main] INFO org.kuali.rice.kew.docsearch.SearchableAttribute :: Indexing document 1764269 for document search...

2012-03-21 07:53:49,325 [main] INFO org.kuali.rice.kns.util.MaintenanceUtils :: starting checkForLockingDocument (by MaintenanceDocument)

2012-03-21 07:53:49,330 [main] ERROR org.kuali.rice.kew.docsearch.SearchableAttribute

:: Encountered an error when attempting to index searchable attributes, requeuing.

java.lang.NumberFormatException: can't convert infinity or NaN

at java.math.BigDecimal.<init>(BigDecimal.java:574)

at java.math.BigDecimal.<init>(BigDecimal.java:541)

at org.kuali.rice.kns.util.AbstractKualiDecimal.<init>(AbstractKualiDecimal.java:61)

at org.kuali.rice.kns.util.KualiDecimal.<init>(KualiDecimal.java:60)

at org.kuali.kfs.module.cam.util.KualiDecimalUtils.safeMultiply(KualiDecimalUtils.java:148)

at org.kuali.kfs.module.cam.document.service.impl.AssetGlobalServiceImpl.setupAsset(AssetGlobalServiceImpl.java:455)

at org.kuali.kfs.module.cam.document.service.impl.AssetGlobalServiceImpl.getSeparateAssets(AssetGlobalServiceImpl.java:408)

at org.kuali.kfs.module.cam.businessobject.AssetGlobal.generateGlobalChangesToPersist(AssetGlobal.java:676)

at org.kuali.rice.kns.workflow.attribute.DataDictionarySearchableAttribute.findAllSearchableAttributesForGlobalBusinessObject(DataDictionarySearchableAttribute.java:394)

Document 1764269 is an Asset Global document.

I sent an email to Theresa/Debra who work with Assets.

The Abort log shows these lines:

at org.kuali.kfs.module.cam.util.KualiDecimalUtils.safeMultiply(KualiDecimalUtils.java:148)

at org.kuali.kfs.module.cam.document.service.impl.AssetGlobalServiceImpl.setupAsset(AssetGlobalServiceImpl.java:455)

When I checked the Java program AssetGlobalServiceImpl at line 455, I could see that it is trying to set the Salvage amount to a decimal. I am therefore requesting that Theresa/Debra clean up the decimal field on the Asset Global. Hopefully this will be the solution; otherwise we will get the abort again tomorrow.

John.
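For background, Java's BigDecimal constructor rejects NaN and infinite double values with a NumberFormatException, which is the exception at the top of the stack trace. The Python sketch below illustrates the kind of guard that would report such a value with a clearer message. It is not the KFS implementation of safeMultiply; the function name, the rounding to two places, and the idea that a non-finite intermediate value came from the asset data are all assumptions.

```python
import math
from decimal import Decimal


def safe_multiply(amount: float, factor: float) -> Decimal:
    """Multiply two floats and return a 2-place decimal, rejecting NaN/infinity."""
    product = amount * factor
    if not math.isfinite(product):
        # Fail with the inputs in the message so the bad document field
        # can be identified without reading a stack trace.
        raise ValueError(f"non-finite result: {amount!r} * {factor!r}")
    return Decimal(str(product)).quantize(Decimal("0.01"))
```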

Aborted Module Name: CLMSACCT.CLMS_URL_EXEC_01

Date: Day: Time: Resolution:

03/23/12 Fri 17:31 See follow up below.

Error log and follow up comments:

StackTrace:

at System.Net.HttpWebRequest.GetResponse()

at ClmHttp.Post(String url, String post_data) in d:\Program Files\SCT\clm\Application\App_Code\ClmHttp.cs:line 41

at ASP.interfaces_outbound_accounting_feed_extract_aspx.page_load() in d:\Program Files\SCT\clm\Application\interfaces\outbound_accounting_feed_extract.aspx:line 10

Message:

The remote server returned an error: (500) Internal Server Error.

==================================================================

=== lynx.stderr ====================================================

=== lynx.status ====================================================

URL=http://clm.colostate.edu/clm/interfaces/outbound_accounting_feed_extract.aspx (GET)

STATUS=HTTP/1.1 200 OK

==================================================================

[100] : *** ERROR : Status (error) Returned ***

rm: Removing lynx.581856/lynx.status

rm: Removing lynx.581856/lynx.stderr

rm: Removing lynx.581856/lynx.stdout

rm: Removing directory lynx.581856

#== Exiting /appworx/csu/exec/CLMS_URL_EXEC.SH [prod acct_feed_extract ] (100) ============

+ err=100

I think there may have been some connectivity issues. I connected to CLM manually this morning ok so hopefully we’re good now. Can you restart the job?

Steven Dove.

I restarted the component and it aborted again. It looks like the same error as before.

Joleen.

I think this job is failing because it isn’t finding any records to process. I checked with Trish and she said it would be ok if we skip this job so our production schedule can continue.

Steven Dove.

Aborted Module Name: HRMSDED_HRL.HRMSS061_11

Date: Day: Time: Resolution:

04/02/12 Mon 16:00 Deleted by Joleen.

Error log and follow up comments:

+ print *** \n*** HRMSR316 EXTRACT UNSUCCESSFUL - ABORT \n***

***

*** HRMSR316 EXTRACT UNSUCCESSFUL - ABORT

***

+ exit 100

The extract was probably unsuccessful because the report (HRMSR316) did not return any data.

Steve H.

We deleted HRMSDED_HRL.HRMSS061_11.

Joleen.

Aborted Module Name: HRMSKFS_QPH.HRMSS175_01

Date: Day: Time: Resolution:

04/02/12 Mon 23:19 See note from Elden below.

Error log and follow up comments:

# COMMAND : /usr/bin/sftp -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_deltadentalco-4096-20100914" CSU1@transfer.deltadentalco.com

# > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

# > @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @

# > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

# > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

# > Someone could be eavesdropping on you right now (man-in-the-middle attack)!

# > It is also possible that the RSA host key has just been changed.

# > The fingerprint for the RSA key sent by the remote host is

# > 55:03:77:0a:88:d1:92:87:7f:bb:d9:bb:1f:ca:eb:87.

# > Please contact your system administrator.

# > Add correct host key in /home/jobprd/.ssh/known_hosts to get rid of this message.

# > Offending key in /home/jobprd/.ssh/known_hosts:30

# > RSA host key for transfer.deltadentalco.com has changed and you have requested strict checking.

# > Host key verification failed.

# > Connection closed

# > (255)

I’m glad you deleted the failed SSH. It looks like they changed their server or server key. The error you’re seeing is a security safety mechanism to reduce the risk of a man-in-the-middle attack. We should check with our HR team to verify if Delta Dental sent out a notice or if they did indeed change their server.

Once we are satisfied that this is a legitimate server for Delta Dental, then log into jobprd:

• comment out (prefix the line with ‘## ‘) the current entry for Delta Dental from jobprd’s ~/.ssh/known_hosts file

o change ‘transfer.deltadentalco.com,205.169.191.2…’ to ‘## transfer.deltadentalco.com,205.169.191.2’

• manually connect to Delta Dental:

o /usr/bin/sftp -oIdentityFile="/home/jobprd/.ssh/csu_to_deltadentalco-4096-20100914" CSU1@transfer.deltadentalco.com

• accept the new server key from Delta Dental

Elden.
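The first step in Elden's list (commenting out the old known_hosts entry) can be sketched in Python as follows. This is an illustration, not a tested production tool. The function name is hypothetical, it assumes unhashed known_hosts entries whose first field is a comma-separated host list, and the address in the usage example is a documentation placeholder rather than the real Delta Dental address.

```python
def comment_out_host(known_hosts_text: str, hostname: str, prefix: str = "## ") -> str:
    """Return known_hosts content with every active entry for hostname commented out."""
    out = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        # The first field lists host names and addresses separated by commas.
        hosts = fields[0].split(",") if fields else []
        if hostname in hosts:
            out.append(prefix + line)
        else:
            # Other hosts, blank lines and already-commented lines are kept as-is.
            out.append(line)
    trailing = "\n" if known_hosts_text.endswith("\n") else ""
    return "\n".join(out) + trailing


if __name__ == "__main__":
    sample = "transfer.deltadentalco.com,192.0.2.10 ssh-rsa AAAA\n"
    print(comment_out_host(sample, "transfer.deltadentalco.com"), end="")
```

Commenting the line out, rather than deleting it, keeps the old key available for comparison. OpenSSH's `ssh-keygen -R hostname` is an alternative that removes the entry outright.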

Aborted Module Name: HRMSACH_QPS.HRMSR218_01

Date: Day: Time: Resolution:

04/03/12 Tue 08:18 See note from David below.

Error log and follow up comments:

HRMSACH_QPS.HRMSR218_01 has a DB ERROR.

2012-04-03 08:13:49 Prompt 21 changed from "{#{#1}_{#2}_CONSUB_REQ_NO}" to "{#HRMSACH_QPS_CONSUB_REQ_NO}" by OSU=appworx JDBC Thin Client

2012-04-03 08:18:30 java.sql.SQLException: ORA-20025: No role access to "Agent as "AWPROD" rtype=N edit=N my_dba=N my useq=115*-5-6-9-11"

ORA-06512: at "APPWORX.AWAPI2", line 4436

ORA-06512: at "APPWORX.AWOP_API", line 675

ORA-06512: at "APPWORX.AW5", line 2467

ORA-06512: at line 1

aw5.aw_condition_action

0 jobid: IN:NUMERIC:java.math.BigDecimal:7927155

1 condition_order: IN:NUMERIC:java.math.BigDecimal:3

2 action: IN:VARCHAR2:java.lang.String:REQUEST JOB

3 performed: IN:OUT:VARCHAR2:java.lang.String:N

4 actionArg: IN:VARCHAR2:java.lang.String:-m HRMSARC_PAYROLL_ARCHIVE -u APPWORX -o AWPROD -q HRMS -f STORE -arg HRMSARC QPS QUICK _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_

5 results: OUT:NUMERIC::null

6 text: OUT:VARCHAR2::null

The actual error was java.sql.SQLException: ORA-20025: No role access to "Agent as "AWPROD". Per Greg I retried again by re-setting the BEFORE condition to submit HRMSARC back to “ONCE” and then re-starting the failed component. It completed okay the second time.

David.

Aborted Module Name: AREGTTRN.RWCLIENT_01

Date: Day: Time: Resolution:

04/10/12 Tue 07:09 See follow up below.

08/20/12 Mon 16:26 Mark B. restarted the report server.

Error log and follow up comments:

04/10/12.

<<errtrap_ssh.6>> print *** \n*** ERROR: Sneffels SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Sneffels SCRIPT ABORTED - EXIT CODE=1

***

<<errtrap_ssh.7>> exit

+ grep SCRIPT ABORTED /ais02/log/AREGTTRN.RWCLIENT_01.7972115.7972129.01.2012_04_10_0709.log

+ 1> /dev/null

+ print rwclient execution unsuccessful

rwclient execution unsuccessful

We’ve contacted the oncall dba (Shawn) to request that the report server be recycled.

Janice.

Does the report server need to be rebooted?

Vicki.

I’ve just restarted the report server. Please let me know if you encounter any further issues.

Shawn.

Just a FYI…I’m working on a way to detect this and automatically restart the report server. Hopefully I’ll have that in very soon.

Mark B.

08/20/12.

<<errtrap_ssh.7>> exit

+ grep SCRIPT ABORTED /ais02/log/AREGTTRN.RWCLIENT_01.8821471.8821485.01.2012_08_20_1626.log

+ 1> /dev/null

+ print rwclient execution unsuccessful

rwclient execution unsuccessful

+ cat /ais02/log/AREGTTRN.RWCLIENT_01.8821471.SEND_MAIL_ERR.DAT

REP-0177: Error while running in remote server

Unable to connect to the specified database.

Mark B. restarted the report server and the job has completed successfully.

David.

Aborted Module Name: ADMSBDMS_APPL_RWCLIENT

Date: Day: Time: Resolution:

04/10/12 Tue 12:06 See follow up below.

Error log and follow up comments:

I see that this and ADMSR207, 209, 263, 269 all failed. Can you let me know whether someone is working on getting these resolved and restarted, and whether there is anything we need to do on our end?

Marcella.

-----Original Message-----

From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]

Sent: Tuesday, April 10, 2012 12:06 AM

To: ADM Systems; IS DL: Alert ADMS

Cc: ADM Systems; IS DL: Alert ADMS

Subject: ADMSR206 FAILED

*** Oracle Report: ADMSR206

Processing Failed -- Report was not successfully generated.

*** From Appworx Chain: ADMSBDMS_APPL_RWCLIENT

*** Oracle Instance: banprod

*** Report Parameters Used:

LAST_REPORT_DATE=20120406221634

SENSITIVE_INFO=YES

*** Report Errors:

REP-0177: Error while running in remote server

OCI_INVALID_HANDLE. ==> --using a union so that the print_sensitive_info parameter will only select the W5 applications

David has replaced the file and reran ADMSBDMS. You should have your reports now.

Joleen.

Aborted Module Name: OSYSJOBS_08.OSYSLLNK_01

Date: Day: Time: Resolution:

03/25/12 Sun 16:37 Restarted by Robin.

Error log and follow up comments:

Here's more about the error:

<#/ais02/job/prod/sys_llnk_ssh.ksh.27#> find / -name tmp -prune -o -name proc -prune -o -type l -ls

<#/ais02/job/prod/sys_llnk_ssh.ksh.27#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_02.OSYSPURG_01.7869420.7869423.00_orahomes is not valid.

<#/ais02/job/prod/sys_llnk_ssh.ksh.27#> errtrap_ssh /ais02/job/prod/sys_llnk_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

We sometimes see these "status not valid" errors with the find command, which tend to be just a timing thing in that a "temp" file existed when the find command "found" it, and then was subsequently deleted, which confuses the find command.

Please restart the failed OSYSJOBS_11.OSYSLLNK_01 component.

Same deal on the failed OSYSJOBS_08.OSYSLLNK_01 - please restart it too.

Janice.
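The timing problem described in this entry (a temp file that exists when the directory is read but is gone by the time its status is checked) can be tolerated rather than treated as fatal. The Python sketch below shows the idea and is roughly analogous to the find command in the log. It is not a replacement for sys_llnk_ssh.ksh; the function name is hypothetical and the skipped names simply mirror the -prune arguments.

```python
import os
import stat


def list_symlinks(root, skip_names=("tmp", "proc")):
    """Return paths of symbolic links under root, ignoring files that vanish mid-scan."""
    links = []
    # os.walk ignores directories it cannot read by default (onerror=None).
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip_names]
        candidates = [n for n in dirnames + filenames if n not in skip_names]
        for name in candidates:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except FileNotFoundError:
                # The entry was deleted between the directory read and lstat.
                continue
            if stat.S_ISLNK(mode):
                links.append(path)
    return links
```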

Aborted Module Name: AROSDBIO.AROSS141_01

Date: Day: Time: Resolution:

04/05/13 Fri 18:05 Fixed data in TWRCUST and restarted.

Error log and follow up comments:

AROSDBIO.AROSS141 aborted in AWPROD for Friday’s schedule. I would prefer to get this one fixed before AppMan is shut down for the DB server changes this Sunday.

ERROR at line 1:

ORA-20100: AROSS141 Failure: -20100 ORA-20100: ::Calculated age cannot be

longer than 3 digits. Please check the birth date.::

ORA-06512: at line 104

Gudrun.

This is a data issue in Twarbus. Someone entered the bday incorrectly.

Please restart the process. The birth date year was 0992, I am assuming that it should be 1992.

Don’t think anyone making twarbus transactions was born over 1000 years ago.

Someone should verify the birthday, the TWRCUST/Spriden ID is: 830163783

There is no reason to hold up the schedule for a typo or get more people involved on the weekend.

AROSS141 captured the error and printed the following line to the UTL file:

Insert failed: 830163783 Lashley -20100 ORA-20100: ::Calculated age cannot be longer than 3 digits. Please check the birth date.::

NOTES:

AROSS141 is a very small program that uses the CSUT_BPI_PROCESS_TWRCUST package

- You can execute the package’s cursor cur_newcust to pull the data that AROSS141 is processing and look for a bad birth date (Year was 0992 instead of 1992)

- If the UTL file exists, then it pointed to the row in TWRCUST that had the bad birth year

- CSUT_BPI_PROCESS_TWRCUST package actually calls the gb_identification, gb_bio, gb_address, gb_telephone, etc Banner APIs to create a new customer/person in Banner

Resolution:

Short Term

Modified AROSS141 to handle the data problem and put it in TEMP. Alternatively, we could have written an update statement to fix the bad data, called the on-call DBA to run it in BANPROD, and then had the program restarted.

Long Term

Modify TWARBUS to check the birthdate so that it will pass the same edits in GB_BIO that the “program” failed on (see GB_BIO_RULES) so that this data entry error will be caught by the form not when AROSS141 runs

--If no dead date, at least check that age <1000 (3 digit limit)

IF p_birth_date IS NOT NULL AND p_dead_date IS NULL AND

trunc(( sysdate - p_birth_date)/365)>999 THEN

p_build_error('BIRTH_DATE_OUT_OF_RANGE');

END IF;

Josh.
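To see why a birth year of 0992 trips the three-digit age limit, the check in the PL/SQL fragment above can be mirrored in Python. This is a sketch for illustration. The function name is hypothetical, and the month and day in the usage example are invented, since only the birth year is given in this entry.

```python
from datetime import date


def birth_date_out_of_range(birth_date, as_of, dead_date=None):
    """Mirror of the rule: with no dead date, the age in years must be under 1000."""
    if birth_date is None or dead_date is not None:
        return False
    # Same approximation as the PL/SQL: whole days divided by 365.
    age_years = (as_of - birth_date).days // 365
    return age_years > 999


if __name__ == "__main__":
    run_date = date(2013, 4, 5)
    print(birth_date_out_of_range(date(992, 6, 15), run_date))   # typo year
    print(birth_date_out_of_range(date(1992, 6, 15), run_date))  # intended year
```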

Aborted Module Name: FAIDDLIM_EV.RERIM12_01

Date: Day: Time: Resolution:

04/25/12 Fri 06:26 Restarted by Joleen.

Error log and follow up comments:

ERROR at line 15:

ORA-12899: value too large for column "GENERAL"."GJBPRUN"."GJBPRUN_VALUE"

(actual: 32, maximum: 30)

After talking to the user, David removed crdl12op.2012_04_24_0006.bak.dat from the ais02 directory. I restarted FAIDDLIM_EV.RERIM12_01. FAIDDLIM_DIRECT_LOAN_IMPORT is now complete.

Joleen.

Aborted Module Name: FAIDSEND.TDCLIENT_SEND_01

Date: Day: Time: Resolution:

04/30/12 Mon 11:44 See follow up below.

Error log and follow up comments:

+ 1>> /ais01/dat/work/prod/FAIDSEND.TDCLIENT_SEND_01_jobstat

+ cat /ais01/dat/work/prod/FAIDSEND.TDCLIENT_SEND_01_jobstat

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cat: 0652-050 Cannot open /userfiles/Ufaid/data/crdl13in_2643924.xml.

FYI

There is a file with this name, crdl13in_2643924.xml, which exists in /oraapps/BANPROD/out/, which was created: 12/04/30 11:36

Did the user run something within gurjobs (via Banner application) to create this file… and now they want it sent?

Do they want us to copy this file to /userfiles/Ufaid/data so they can review it before transmittal?

Janice.

I worked with Karma to fix this.

David.

Aborted Module Name: AREGDYGN.AREGS301_01

Date: Day: Time: Resolution:

05/02/12 Wed 20:05 See follow up below.

Error log and follow up comments:

ORA-06512: at line 121

20:05:08 119 BEGIN

20:05:08 120 --Update SORLFOS for NULL dept codes--

20:05:08 121 UPDATE sorlfos a

20:05:08 122 SET sorlfos_dept_code = (SELECT swblfos_dept_code

20:05:08 123 FROM swblfos

20:05:08 124 WHERE swblfos_majr_code = a.sorlfos_majr_code

20:05:08 125 AND swblfos_activity_date =

20:05:08 126 (SELECT MAX(swblfos_activity_date)

20:05:08 127 FROM swblfos

20:05:08 128 WHERE swblfos_majr_code = a.sorlfos_majr_code)),

20:05:08 129 sorlfos_activity_date = SYSDATE,

Please run the query above and figure out what the problem is (too many rows) and see if Jerry or his folks can't resolve the issue.

Vicki.

I took a look at the problem and saw that in SWBLFOS there were two entries made yesterday for the IEAQ MAJR_CODE, one by Jerry and one by Sue. They may have different start and end terms, but AREGS301 doesn't take that into account, it goes by the MAX Activity Date - one has to go for now. And I'm guessing AREGS301 may have to change to take the term into account.

Peter.
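The situation Peter describes (two SWBLFOS rows for one major sharing the latest activity date, so the subquery returns more than one row) can be detected ahead of the run. The Python sketch below illustrates that check on rows already fetched from the table. The function name and the tuple layout are assumptions, and the second department code in the usage example is invented for illustration.

```python
from collections import defaultdict
from datetime import date


def majors_with_ambiguous_dept(rows):
    """rows: iterable of (majr_code, dept_code, activity_date).

    Return {majr_code: [dept_code, ...]} for majors where more than one row
    shares the maximum activity date.
    """
    by_major = defaultdict(list)
    for majr_code, dept_code, activity_date in rows:
        by_major[majr_code].append((activity_date, dept_code))

    ambiguous = {}
    for majr_code, entries in by_major.items():
        latest = max(activity for activity, _ in entries)
        depts = [dept for activity, dept in entries if activity == latest]
        if len(depts) > 1:
            ambiguous[majr_code] = depts
    return ambiguous


if __name__ == "__main__":
    sample = [
        ("IEAQ", "1784", date(2012, 5, 1)),
        ("IEAQ", "9999", date(2012, 5, 1)),  # invented second entry
    ]
    print(majors_with_ambiguous_dept(sample))
```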

I have deleted the offending record.

Jerry.

I restarted AREGS301 and AREGDYGN_DAILY_GENERAL has completed.

Joleen.

If I re-enter the offending record today would AREGS301 run to completion? The activity dates for the two records should be different...

Jerry.

A question I have is which Department Code would you like to get picked up when AREGS301 runs since it would get the record with the MAX Activity Date. The Department Code that is associated with the record that was entered yesterday, 1784, would never get picked up if you entered another record today.

Peter.

Ok - that would not be good... I'll hang onto the transaction until AREGS301 is modified or fall 2012 starts.

Jerry.
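
Illustration of the AREGS301 abort above: the update appears to fail when the subquery on SWBLFOS returns more than one row for a major code, which can happen when two rows share the latest activity date. The Python sketch below is only a rough model of that check on an in-memory list of rows. The tuple layout, the function name ambiguous_majors, and all sample values other than major IEAQ and department code 1784 are assumptions made for the example; it is not the AREGS301 code.

```python
from collections import defaultdict
from datetime import date


def ambiguous_majors(rows):
    """Return {majr_code: [dept_codes]} for majors where more than one
    row carries the latest activity date (the "too many rows" case)."""
    latest = {}
    for majr, _dept, activity in rows:
        if majr not in latest or activity > latest[majr]:
            latest[majr] = activity

    tied = defaultdict(list)
    for majr, dept, activity in rows:
        if activity == latest[majr]:
            tied[majr].append(dept)

    return {majr: depts for majr, depts in tied.items() if len(depts) > 1}


# Hypothetical sample rows: (majr_code, dept_code, activity_date).
rows = [
    ("IEAQ", "1784", date(2012, 5, 1)),
    ("IEAQ", "9999", date(2012, 5, 1)),  # assumed second entry, same date
    ("XXXX", "1111", date(2012, 4, 2)),  # assumed unrelated major
]
print(ambiguous_majors(rows))
```

A check along these lines could be run before the update to report which major codes need attention, rather than discovering them through the abort.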

Aborted Module Name: KFSXCS31.KFSXS052_01

Date: Day: Time: Resolution:

05/14/12 Mon 07:23 Restarted by Dermot.

Error log and follow up comments:

ORA-01555: snapshot too old: rollback segment number 34 with name

"_SYSSMU34_400256035$" too small

ORA-02063: preceding line from KRPRD@KRUSER

ORA-06512: at line 168

07:23:31 160 Begin

07:23:31 161

07:23:31 162 --Document the program has started.

07:23:31 163 DBMS_OUTPUT.PUT_LINE

07:23:31 164 ('**** Start of KFSXS052 ' ||to_char(sysdate,'MM/DD/YYYY HH24:MI:SS'));

07:23:31 165

07:23:31 166 file_handle1 := UTL_FILE.FOPEN (utlpath, outfile1, 'W');

07:23:31 167

07:23:31 168 For doc_rec in C1 Loop

07:23:31 169 IF v_written < v_max_count Then

07:23:31 170 outdata := doc_rec.fdoc_nbr;

07:23:31 171 UTL_FILE.PUT_LINE(file_handle1, outdata);

07:23:31 172 v_written := v_written + 1;

07:23:31 173 End if;

Shawn increased the space on Rice, will you restart the job?

Josh.

Aborted Module Name: CLMSDATA.SSH_EXEC_01

Date: Day: Time: Resolution:

05/30/12 Wed 05:44 See follow up below.

Error log and follow up comments:

# 2012.05.30-05:52:19 : > Description: An error occurred with the following error message: "Mailbox unavailable. The server response was: 5.1.1 <heidi.kerr@colostate.edu>... User unknown".

# 2012.05.30-05:52:19 : > End Error

Heidi Kerr is no longer with CSU.

Please replace her email with Celeste Ulland (Celeste.Ulland@Colostate.EDU).

Phil.

This email is actually stored in the procedure (SQL package) we are executing remotely. However, it is not the only warning/error in the script. Ultimately, the script failed with:

# 2012.05.30-05:52:19 : > Warning: 2012-05-30 05:52:19.50

# 2012.05.30-05:52:19 : > Code: 0x80019002

# 2012.05.30-05:52:19 : > Source: BannerUpdateDriver

# 2012.05.30-05:52:19 : > Description: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (1) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.

# 2012.05.30-05:52:19 : > End Warning

# 2012.05.30-05:52:19 : > DTExec: The package execution returned DTSER_FAILURE (1).

Elden.

FYI – just to add to Elden’s conclusion

Here is the command that is being remotely executed once ssh connection has been established. We can’t make any changes to those files.

cmd.exe /V:ON /C

"D:\Program Files\SCT\clm\DtsxPackages\dtexec_wrapper.cmd"

"file=D:\Program Files\SCT\clm\DtsxPackages\BannerUpdateDriver.dtsx"

"config=C:\Users\sshuser\.ssh2\BannerUpdateDriver.cfg"

Gudrun.

I’ve updated the email list in the DTSX package and copied to the CLM server.

Can you add me to the IS DL: Alert CLMS HOUS group?

~Steven Dove.

We do not add non-IS staff to these lists. Instead, we request that the user department set up their own list and then we add that list to the appropriate address file or other appropriate object for the process flows. We do not want to be responsible for maintaining user address lists. Janice has gone through a lot of work to support this methodology.

Elden.

There are two potential lists that you could be added to:

BFS_CLM_FTP (used for CLMSSEND_NSLDS_DATA and CLMSSGET_NSLDS_ERROR_FILE)

BFS_CLM_PRODUCTION (used for CLMSAM99_CLMS_SCHEDULE_DONE)

I believe that the BFS personnel are responsible for updating these lists.

You should also remove Jan Mueller (retired) and Phil Chambers from the BFS_CLM_FTP list.

We also noticed that Heidi Kerr is on this list.

David.

Chris Glaze has updated the lists. Thanks for pointing me in the right direction.

~Steven Dove.

Aborted Module Name: KFSXCS31.KFSXS052_01

Date: Day: Time: Resolution:

09/29/12 Fri 22:11 Restarted by Dermot.

Error log and follow up comments:

ERROR at line 1:

ORA-01555: snapshot too old: rollback segment number 15 with name

"_SYSSMU15_2222981817$" too small

ORA-02063: preceding line from KRPRD@KRUSER

ORA-06512: at line 168

Please restart the module.
Thanks
Josh.

Aborted Module Name: FAIDALIM.SSH_SFTP_RN_02

Date: Day: Time: Resolution:

10/25/12 Thu 07:02 Restarted by Joleen.

Error log and follow up comments:

# > Permission denied (password,gssapi-with-mic).

***

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

I just tested the connection and it worked.

Since this job originally failed during the connection, unless the process flow and job notes suggest otherwise, you may reset the job.

Elden.

There were no notes and no conditions. I reset the job. FAIDALIM_ALTERNATE_LOAN_IMPORT has finished running.

Joleen.

Aborted Module Name: KFSXPDCA.KFSX_JAVA_01

Date: Day: Time: Resolution:

07/17/12 Tue 19:10 Requested in again by Dermot.

Error log and follow up comments:

2012-07-17 19:10:42,431 [RMI TCP Connection(9)-129.82.127.238] FATAL org.kuali.rice.core.database.KualiTransactionInterceptor :: Exception caught by Transaction Interceptor, this will cause a rollback at the end of the transaction.java.lang.NullPointerException

2012-07-17 19:10:42,267 [RMI TCP Connection(9)-129.82.127.238] INFO org.kuali.rice.kns.service.impl.DocumentServiceImpl :: storing document 1919369

2012-07-17 19:10:42,672 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: java.lang.NullPointerException

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.44#> [[ 1 > 0 ]]

<#errtrap_ssh.44#> exit 1

<</ais02/job/prod/kshexe_ssh.90>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1

As per John, it appeared that there was a glitch with RICE while this JAVA step was running: processPdpCancelsAndPaidStep.

Process Flow / Chain was restarted the next morning and completed successfully.

If this ABORT happens again then check that format checks "KFSXPDDR" is not running before requesting the KFSXPDCA Process Flow.

Dermot.

Aborted Module Name: APMXMISC.APMXPURG_01

Date: Day: Time: Resolution:

06/27/12 Mon 13:35 Restarted by Joleen.

Error log and follow up comments:

rm: 0653-609 Cannot remove ./obs/COMPLETION.ACRD.20081031_092518.

The file access permissions do not allow the specified action.

rm: 0653-609 Cannot remove ./obs/FTP_EPRINT.SH.20030818_104042.OBS.

The file access permissions do not allow the specified action.

rm: 0653-609 Cannot remove ./obs/FTP_EPRINT.SH.20050505_131713.OBS.

The file access permissions do not allow the specified action.

What is the status of this jobs directory? Do we want it to be part of the regular APMXPURG removal of files in /appworx/csu/exec?

Gudrun.

Just a permissions issue. The apmxpurg script cleaned up timestamps for the first time today. Greg fixed the permissions issue for the obs directory.

David.

Aborted Module Name: APMXLOOK_AM.SEND_MAIL_04

Date: Day: Time: Resolution:

08/20/12 Mon 08:00 Deleted by David.

08/04/14 Mon 08:00 Deleted by David.

Error log and follow up comments:

08/20/12.

# --> --options=" ERROR -999 ORA-01722: invalid number"

# --> --options=""

# > (3)

#==============================================================================

# FATAL : Command failed with code : 3

#------------------------------------------------------------------------------

# RETURN CODE = 100 (/appworx/csu/exec/build_parms_with_multiselect.pl)

#==============================================================================

# Passing Parms : arg=[ from="jobprd@mailer.is.colostate.edu" reply_to="/ais01/dat/misc/mailst/SEND_MAIL.IS_SUPPORT_SCHEDULING.LST" to="Jobs" cc="-" bcc="Production" subject="Jobs" --options=" ERROR -999 ORA-01722: invalid number" --options=""] /usr/bin/perl /appworx/csu/exec/SENDMAIL.PL from="jobprd@mailer.is.colostate.edu" reply_to="/ais01/dat/misc/mailst/SEND_MAIL.IS_SUPPORT_SCHEDULING.LST" to="Jobs" cc="-" bcc="Production" subject="Jobs" --options=" ERROR -999 ORA-01722: invalid number" --options=""

#==============================================================================

# [ 2012.08.20-08:00:30 ]

#******************************************************************************

# FATAL : < main::parse

# FATAL : Unknown option ( ERROR -999 ORA-01722: invalid number)

I deleted this. This will hopefully be fixed next time it runs. I added account jobprd to the /ais01/dat/apmx/prod/APMXLOOK_EMAIL_OVERRIDE.DAT file to send the Email to APMX developers if the owner of the temp file is jobprd.

David.

08/04/14.

Mon Aug 04 08:05:30 MDT 2014 Page 1

Check Backlog for ABORTED jobs (so_status 202)

Job Chain Id Start Date Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

------------------------ -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

APMXLOOK_AM.SEND_MAIL_04 08-04-2014 08:00:46 MDT 202 ABORTED 279

I fixed APMXLOOK_AM.SEND_MAIL_04.

I added Steve Greene to the /ais01/dat/apmx/prod/APMXLOOK_EMAIL_OVERRIDE.DAT file so his Email would be derived correctly in the future.

I had to manually fix the prompt in the APMXLOOK_AM.SEND_MAIL_04 for this one and re-start.

David.
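
Illustration of the 08/20/12 failure above: the value handed to the mailer as --options was an Oracle error string rather than a usable value, and the mailer rejected it as an unknown option. The Python sketch below shows one way such a value could be screened before the mail command is built. The function name usable_options and the idea of screening on an ORA-nnnnn pattern are assumptions for illustration; this is not how build_parms_with_multiselect.pl or SENDMAIL.PL actually work.

```python
import re

# Oracle error codes look like ORA- followed by five digits.
ORA_ERROR = re.compile(r"\bORA-\d{5}\b")


def usable_options(values):
    """Split derived values into (usable, rejected).

    Empty values are dropped; values carrying an Oracle error code
    are rejected so they never reach the mail command line.
    """
    usable, rejected = [], []
    for value in values:
        value = value.strip()
        if not value:
            continue
        if ORA_ERROR.search(value):
            rejected.append(value)
        else:
            usable.append(value)
    return usable, rejected


# The two --options values seen in the 08/20/12 log.
good, bad = usable_options([" ERROR -999 ORA-01722: invalid number", ""])
print(good, bad)
```

With a screen like this, a failed lookup could be reported to the developers instead of aborting the SEND_MAIL component.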

Aborted Module Name: APMXLOOK_AM.SEND_MAIL_01

Date: Day: Time: Resolution:

03/18/13 Mon 08:00 see note from Steve below.

Error log and follow up comments:

# - Sending Message

# MIME::Lite version : 3.027

# MAIL COMMAND : smtp.colostate.edu , Debug => '0', Timeout => '60'

# BUILDING HEADERS

# BUILDING BODY

SMTP recipient() command failed:

5.2.2 mail delivery suspended,mailbox full

error is 255

===== Exiting PERL_CSU =====

+ err=255

I removed the CC Email address and it finished successfully.

Elden, it looks like one of these mailboxes is full:

eflick@lamar.colostate.edu

eflick@mail.colostate.edu

elden.flick@colostate.edu

Steve.

The lamar mailbox was showing mostly empty, but evidently was only marking files as deleted instead of deleting them! It should be good for now.

Elden.

Aborted Module Name: ODSRKFSX.ODSRS002_01

Date: Day: Time: Resolution:

09/25/12 Tue 00:53 See follow up below.

Error log and follow up comments:

Can you send me the location of the log file from whatever job it was that you called me about? I see no email yet with the job number or anything.

Mark. B.

Just sent

Gudrun.

Can you restart this job easily? I can’t really see anything wrong at this point.

Mark. B.

If it can’t be resolved at this point only ODS schedule will be delayed to complete given the dependencies out there for that component.

KFSX schedule actually completed already.

If you want to reset call 970 581 5577. Probably this one will wait until regular work hours I assume for being resolved.

Gudrun .

I will leave an email for Mark with my findings…somewhere we have a password issue but I can’t find the problem. Hopefully he’ll know what it is right away. After that I’m logging out and going to bed.

Mark. B.

Mark B needs to talk to Mark P during the day. ODS schedule will be delayed to complete. Other schedules moving except for EID schedule which has another abort due to be resolved during the day.

However post AGEN notify file creation. :)

Gudrun .

One of you can re-start the failed chains. I have corrected the connection problems on ODS Prod.

This was caused by me yesterday afternoon while I was finishing post clone steps on ODSDevl and I accidentally changed a couple of passwords on ODSProd.

Mark P.

Aborted Module Name: EIDSUPDT.HRMSS111_01

Date: Day: Time: Resolution:

09/24/12 Mon 22:24 Restarted by Robin.

Error log and follow up comments:

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

829886569 ORA-01422: exact fetch returns more than requested number of rows

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

Bob and I took a quick look at this abort and did not see any reason for the problem.

Robin - can you please restart HRMSS111 and see if it doesn't work fine this time.

Vicki.

Aborted Module Name: ADMSLETA_ADMIT_DENY_LETTERS

Date: Day: Time: Resolution:

09/28/12 Fri 22:01 See follow up below.

Error log and follow up comments:

We had a problem with our admit/deny program so no Admit letters were generated on Friday night (for Spring, Summer & Fall semester). We are looking into this and will give you an update later today.

Could you tell us what web page this job is calling? It would be helpful in troubleshooting the problem.

Erica Burr.

URL is https://wsnet.colostate.edu/ai/appworx/RunLetters.aspx

David.

We gave you the wrong URL; it should be https://wsnet.colostate.edu/ai/Tools/Letters/RunLetters.aspx.

Can you please update the Appworx job ADMSLETA_ADMIT_DENY_LETTERS to call https://wsnet.colostate.edu/ai/Tools/Letters/RunLetters.aspx?

No need to rerun the job. I manually ran it and it's fine.

https://wsnet.colostate.edu/ai/appworx/RunLetters.aspx - ADMSLETA_ADMIT_DENY_LETTERS

Dependency: No

Run Time: 8:00 PM

Occurs: Daily Monday - Sunday

Description: admit/deny letters.

Erica Burr.

I have replaced the URL with:

https://wsnet.colostate.edu/ai/Tools/Letters/RunLetters.aspx

Joleen.

Aborted Module Name: HRMSS006.FTPS_CURL_01

Date: Day: Time: Resolution:

10/06/12 Sat 14:19 Restarted by Steve.

Error log and follow up comments:

# > * SSLv3, TLS handshake, Finished (20):

# > } [data not shown]

# > * SSLv3, TLS change cipher, Client hello (1):

# > { [data not shown]

# > * SSLv3, TLS handshake, Finished (20):

# > { [data not shown]

# > * SSL connection using DHE-RSA-AES128-SHA

# > * Server certificate:

# > * subject: /C=US/ST=CO/O=SDNH/L=Denver/OU=300/emailAddress=netadmin@policy-studies.com/CN=www.sdnh.state.co.us

# > * start date: 2007-05-02 19:08:08 GMT

# > * expire date: 2017-05-03 02:08:07 GMT

# > * common name: www.sdnh.state.co.us (does not match 'sdnh.state.co.us')

# > * issuer: /C=US/ST=CO/O=SDNH/L=Denver/OU=300/emailAddress=netadmin@policy-studies.com/CN=www.sdnh.state.co.us

# > * SSL certificate verify result: self signed certificate (18), continuing anyway.

# > > USER CSU

# > * FTP response reading failed

# > * Closing connection #0

# > * SSLv3, TLS alert, Client hello (1):

# > } [data not shown]

# >

# > curl: (56) FTP response reading failed

# > (56)

#==============================================================================

# FATAL : Command failed with code : 56

I reset this and it finally finished...

Steve G.

As per Steve this ABORT with the above error message is ok to restart.

Dermot.

Aborted Module Name: HRMSCPR_SAL_HRMSS063_01

Date: Day: Time: Resolution:

10/09/12 Tue 08:22 Restarted by Robin.

Error log and follow up comments:

+---------------------------------------------------------------------------+

Start of log messages from FND_FILE

+---------------------------------------------------------------------------+

End of log messages from FND_FILE

+---------------------------------------------------------------------------+

**** Start of HRMSS063 10/09/2012 08:22:43

Amount Not Distributed: Hadrich,Joleen Moving Reimbursement 1371910

837.08

declare

ERROR at line 1:

ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

ORA-06512: at line 1063

Robin,

Go ahead and retry it. I added the account to the GL_CODE_COMBINATIONS table so we should be good to go.

Steve H.

Aborted Module Name: KFSXAM11.WAIT_ENCUMB_DEL_01

Date: Day: Time: Resolution:

10/11/12 Thu 01:58 Restarted by Joleen.

Error log and follow up comments:

10/11/2012 01:58 JWEARNE

I received a Page just after 1am. The message said Application manager Agent not running. I saw that KFSXAM11.WAIT_ENCUMB_DEL_01 was aborted. There was a message in Comments that said:

Agent error : timeout SeqNo 170863 Agent AWPROD Master AWPROD service AWPROD Method openFile [$SQLOPER_HOME/out/KFSXAM11.WAIT_ENCUMB_DEL_01.9122369.00.txt, false, false] : null

2012-10-11 01:55:46

There was no output file. The only KFSX job waiting to run was KFSXAM99. The only HRMS jobs waiting to run are HRMSAM99 and HRMSS033 which is waiting for 03:00. I saw in the comments query that this job had aborted on 9/13/2012.

I did a history, I saw that Gudrun had restarted the job that night, so I restarted this one and it finished.

Aborted Module Name: ADMSSRLD_DY.SQLSURLOAD-LOOP_01

Date: Day: Time: Resolution:

10/12/12 Fri 22:23 Restarted by Joleen.

Error log and follow up comments:

Component ADMSSRLD_DY.SQLSURLOAD-LOOP_01 aborted in AWPROD with the error message below for letter AEML_MU3AS3 and the sql executed at the time. It appears the other letters completed except for AEML_MU3S3.

Suggested Resolution Path: First answer question then act.

Questions: Is the sql ok ? If sql is not ok what is the correct sql. We need to rerun letters just for that sql. Provide a temp .dat file and reset.

If sql is ok should it have generated letters.

If not delete component let other things finish. Identify reason why failed in AppMan the sql ?

If letter should have been generated and sql is ok we need to rerun for those letters only – provide temp .DAT file with only that sql I assume. Also Identify reason why sql failed in AppMan.

SP2-0734: unknown command beginning "and c.sarc..." - rest of line ignored.

SP2-0044: For a list of known commands enter HELP

SP2-0734: unknown command beginning "and c.sarc..." - rest of line ignored.

SP2-0734: unknown command beginning "and e.gore..." - rest of line ignored.

SP2-0044: For a list of known commands enter HELP

Gudrun.

I think Bev will need to take a look at ADMSSRLD on Monday when she gets back. I'm pretty sure she will need to talk to the user about this one.

When EIDSUPDT aborts Vicki manually runs the sql section in question and many times she doesn't find an error. She has me restart and it usually finishes without aborting again. We can wait until Monday to have someone look at this if you like. I'm wondering if we should just restart since it doesn't seem to hurt anything. Also, Vicki had a family emergency and had to leave town. I'm not sure if any of my team members know EIDS as well as she does. What do you think?

Joleen.

Thanks for the info. Yes, let's restart EIDSUPDT and see what happens. It was already restarted once anyway. No conditions attached. About the letters, I added some more information. Maybe Rob or Rami can help. At this point I think the sql is correct. It does not return anything, and the component can be deleted provided the other letters got generated, but I am waiting for confirmation on that.

Gudrun.

Aborted Module Name: AREGDYMP.VPLUS_RCAP-LOOP_01

Date: Day: Time: Resolution:

10/14/12 Sun 22:15 Restarted by Joleen.

Error log and follow up comments:

+ spawned_module_name=VPLUS_RCAP

+ . SRC_APMX_STATUS_FOR_SPAWNED.KSH

+ set -x

+ awexe jh

+ grep 9144166

+ egrep ABORTED|CRITFAIL|C-Error

9144166.00 BATCH AREGDYMP.VPLUS_RCAPT10/14 22:16 00:00:02 ABORTED AREGDYMP_MATH_PLACEMENT_LOAD

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

This looper actually spawns another job to do the captures. The log for the failed spawned job is:

AREGDYMP.VPLUS_RCAPTURE_01.9144158.9144166.00.2012_10_14_2215.AWPROD.LOG

The error in this log is:

Continuing with data transfer.

ERROR: Reading Packet from server: error=-3

ERROR: Capture processing terminated.

ERROR: Network read error: 73; Connection reset by peer

ERROR: Reading Packet from server: error=-2

+ exit 17

error is 17

This is the same error Rich and I have been working on.

I checked if the file that RCAPTURE was trying to capture to Vista Plus was in Vista Plus -- it wasn't. I also checked the driver for the RCAP-LOOP and found it just had the one entry, so I reset the job.

I confirmed the report is in Vista Plus.

Elden.
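
Illustration of the looper check traced above: the job history is filtered for the spawned job's id and then for the statuses ABORTED, CRITFAIL or C-Error, and any hit aborts the looper. The Python sketch below mirrors that grep/egrep pipeline on a block of text. The function name failed_history_lines and the second sample line are assumptions for the example; the real check lives in SRC_APMX_STATUS_FOR_SPAWNED.KSH and reads the output of awexe jh.

```python
FAILURE_STATUSES = ("ABORTED", "CRITFAIL", "C-Error")


def failed_history_lines(history_text, job_id):
    """Return history lines for job_id that show a failure status."""
    return [
        line
        for line in history_text.splitlines()
        if job_id in line and any(status in line for status in FAILURE_STATUSES)
    ]


# First line is taken from the log above; the second is a made-up
# example of a job that finished normally.
history = "\n".join([
    "9144166.00 BATCH AREGDYMP.VPLUS_RCAPT10/14 22:16 00:00:02 ABORTED AREGDYMP_MATH_PLACEMENT_LOAD",
    "9144170.00 BATCH EXAMPLE.JOB 10/14 22:17 00:00:05 FINISHED EXAMPLE_FLOW",
])

hits = failed_history_lines(history, "9144166")
if hits:
    print("Failure in spawned VPLUS_RCAP - abort this module")
```

Note that the looper only reports that the spawned job failed; the actual cause (here the network read error) has to be read from the spawned job's own log.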

Aborted Module Name: EIDSUPDT.EIDSS002_01

Date: Day: Time: Resolution:

10/14/12 Sun 22:15 Restarted by Joleen.

01/04/13 Fri 22:28 Restarted by Joleen.

Error log and follow up comments:

10/14/12.

I received a DBA cell call at 2:20 am. Nobody stated anything. However, I logged on and saw two job components aborted.

These have to be followed up during the daytime. EIDSUPDT will delay ODS. I made a Tracker web entry.

1. EIDSUPDT.EIDSS002_01

Processed 100 rows

declare

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error

ORA-06512: at line 27

Gudrun.

Recently when EIDSUPDT.EIDSS002_01 has aborted, I have been asked to just restart it. After checking the Abort log and consulting with Gudrun, I restarted EIDSUPDT.EIDSS002_01 and it finished running just before 10am.

Joleen.

01/04/13.

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error

ORA-06512: at line 23

This job aborted on Thursday with the same error. Peter worked for hours trying to figure out what the problem was. In the end we just restarted the job and it finished. I am going to restart this job and hope it finishes. It is holding up the EIDS PROD and HRMS refresh.

Joleen.

Aborted Module Name: FAIDSAIG_OD.TDCLIENT_01

Date: Day: Time: Resolution:

10/17/12 Wed 00:15 See note from Gudrun & David below.

10/18/12 Thu 01:28 See note from Gudrun & Elden below.

Error log and follow up comments:

Acknowledge AppMAN FAIDSAIG_OD.TDCLIENT_01 abort in AWPROD. Unlike last night, TDCLIENT_01 processing completed partially. It failed during NOTISIR processing. I am contacting David and/or Elden. Need confirmation to ONLY rerun the NOTISR section of that run.

# 20121017-000941 : pipe_exec | cmdout = <debug1: Exit status 139

# 20121017-000941 : *** FATAL ***main::check_status | SEND_TO_CMD (close) [0] failed to execute (139)

# 20121017-000941 : *** FATAL ***main::check_status | (100)

# 20121017-000941 : *** FATAL ***main::check_status |

# 20121017-000941 : *** FATAL ***main::check_status | CMDOUT (close) [1065140] failed to execute (100)

# 20121017-000941 : *** FATAL ***main::check_status | (100)

# 20121017-000941 : *** FATAL ***main::check_status |

This is a non-critical abort; however, because it delays the ODSR schedule, which is still critical, I made an entry in Tracker Web.

Gudrun.

Gudrun called to report FAIDSAIG_OD.TDCLIENT_01 had failed.

I verified her findings that it failed during NOTISIR processing. I advised her to contact Elden to get help with the appropriate recovery.

David.

10/18/12.

Segmentation fault error. Component is non-critical; however, I informed Elden.

Gudrun.

FAIDSAIG_OD.TDCLIENT - core dump the second night in a row. As with the failure the night before, the ISIR files were downloaded and accumulated successfully. However, this time we received one non-ISIR file CRDL13OP and failed on the CRPG13OP download. I copied the CRDL13OP file from the work directory to where it would have been copied on success (/userfiles/Ufaid/data). Since it was already downloaded, TDCLIENT would not find it when I reset the process flow. However, the logic will think it was left over from a previous run and process it as expected. Next, I changed the "Run ISR files" prompt in backlog from "BOTH" to "NOTISR" to pick up just the non-ISIR files.

* I put the next component in the process flow on hold

* I reset the TDCLIENT

* When it finished successfully and the log looked good, I released the hold on the next component

It looks like we didn't receive the CRPG13OP successfully in the first run and it wasn't found to download in the second run, so Financial Aid may need to research and request the file again if needed.

The default is 'BOTH' which runs ISIR-s, then non-ISIR-s. Since it failed after ISIR-s, I changed 'BOTH' to 'NOTISR' to just pick up the non-ISIR-s. However, since we are running TDCLIENT from Quartz (as of a few months ago), we also have to manually copy any successfully downloaded files from the working _proxy_ directory back to the original directory (/userfiles/Ufaid/data/ in this case) before we reset with the 'NOTISR' option.

Here is the modlog for the change to the script:

#* 09/09/2011--GK--T07365--Pass in prompts for ftpfrom/ftpto_user *

#* Allow for easy resetting of runtype RECEIVE *

#* by specifying which files to process to var *

#* run_isr_files:ISR, NOTISR or BOTH.Dft:BOTH *

For now, please contact me for aborted TDCLIENT jobs

Elden.
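
Illustration of the run_isr_files prompt described above: 'BOTH' runs the ISIR files first and then the non-ISIR files, while 'ISR' and 'NOTISR' restrict the run to one group. The Python sketch below models only that selection. The function name files_to_process, the caller-supplied set of ISIR message classes, and the placeholder class EXAMPLE_ISIR are assumptions for the example; how the real script tells ISIR files from non-ISIR files is not shown in this log.

```python
VALID_RUN_TYPES = ("ISR", "NOTISR", "BOTH")


def files_to_process(message_classes, isir_classes, run_isr_files="BOTH"):
    """Return the message classes to process, in processing order.

    BOTH   -> ISIR classes first, then non-ISIR classes
    ISR    -> ISIR classes only
    NOTISR -> non-ISIR classes only
    """
    if run_isr_files not in VALID_RUN_TYPES:
        raise ValueError(f"run_isr_files must be one of {VALID_RUN_TYPES}")

    isir = [m for m in message_classes if m in isir_classes]
    other = [m for m in message_classes if m not in isir_classes]

    if run_isr_files == "ISR":
        return isir
    if run_isr_files == "NOTISR":
        return other
    return isir + other


# CRDL13OP and CRPG13OP are the non-ISIR classes named above;
# EXAMPLE_ISIR is a placeholder for an ISIR message class.
waiting = ["CRDL13OP", "EXAMPLE_ISIR", "CRPG13OP"]
print(files_to_process(waiting, {"EXAMPLE_ISIR"}, "NOTISR"))
```

This covers only the prompt. The manual copy from the working _proxy_ directory back to /userfiles/Ufaid/data is still a separate step before resetting with 'NOTISR'.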

Aborted Module Name: FAIDTRAK_OD.GLBDATA_21

Date: Day: Time: Resolution:

10/17/12 Wed 06:57 Restarted by Steve.

Error log and follow up comments:

ORA-03135: connection lost contact

Process ID: 0

Session ID: 0 Serial number: 0

SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon> ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon> ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

Now turn on set -x for debug purposes

+ [ -f login.11174695 ]

+ echo Could not log in to SQL*Plus.

Could not log in to SQL*Plus.

+ echo Exiting with error (return code = 5).

Exiting with error (return code = 5).

+ exit 5

+ err=5

The last GLBDATA_21 did not start processing. Current iteration count for glbdata loop is at 20.

Resetting the GLBDATA-LOOP component will restart processing from where we left off.

Gudrun.

Aborted Module Name: FAIDALEX_OD.SSH_SFTP_01

Date: Day: Time: Resolution:

10/18/12 Wed 18:02 Restarted by Elden.

04/10/13 Wed 18:01 See note from Gudrun.

Error log and follow up comments:

10/18/12.

FAIDALEX_OD.SSH_SFTP_01 -- while I was working on the FAIDSAIG aborted job, I also noticed that this FAIDALEX process flow also aborted. I researched this and believe the file was transferred ok; it looks like ELM processed it so fast that we couldn't find it when we tried to list it.

I found the file in the transferred folder and it matches the file we sent.

* Since the file was already uploaded, I decided to confirm connectivity and finish the job successfully, so I changed the prompts to download the copy from the remote transferred directory.

* I reset the job and it finished successfully.

FOLLOW UP:

* While testing, sometimes the SSH identity file was failing and instead was trying to use password authentication. We need to research this a bit.

* We probably want to add the "nofatal_list_after_put" option to the job.

Elden.
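
Illustration of the 10/18/12 follow-up: the upload succeeded, but the listing afterwards could not find the file because the remote side had already moved it. The Python sketch below shows the general idea of treating a missing listing after a successful put as non-fatal. The option name nofatal_list_after_put comes from the note above, but its exact behavior in the SSH_SFTP job is an assumption here, as are the function name put_then_verify and the put and list_remote callables.

```python
def put_then_verify(put, list_remote, name, nofatal_list_after_put=False):
    """Upload name, then look for it in the remote listing.

    put         -- callable that uploads the file (raises on failure)
    list_remote -- callable returning the remote file names
    Returns "verified" if the file is listed, "uploaded-unverified" if
    it is missing but the option allows that, otherwise raises.
    """
    put(name)  # any upload error is still fatal
    if name in list_remote():
        return "verified"
    if nofatal_list_after_put:
        return "uploaded-unverified"
    raise RuntimeError(f"{name} not found in remote listing after put")


# Stand-ins for the real transfer: the upload is recorded, and the
# remote listing is already empty because the file was picked up.
uploaded = []
status = put_then_verify(uploaded.append, lambda: [], "example.dat",
                         nofatal_list_after_put=True)
print(status)
```

An "uploaded-unverified" result would still be worth a follow-up check with the receiving side, as in the 04/10/13 note.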

04/10/13.

Wed Apr 10 18:05:45 MDT 2013 Page 1

Check Backlog for ABORTED jobs (so_status 202)

Job Chain Id Start Date Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

----------------------- -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

FAIDALEX_OD.SSH_SFTP_01 10256259 04-10-2013 18:01:24 MDT 202 ABORTED 2102.24 248 12

FYI - just looked at the output - a thought

For this abort we may need to check whether they received the file. It appears sftp was able to connect but the host failed to communicate back?

Gudrun.

Aborted Module Name: FAIDSAIG_OD.TDCLIENT_01

Date: Day: Time: Resolution:

10/19/12 Fri 00:09 See followup from Elden below.

Error log and follow up comments:

10/19/2012 01:41 EFLICK

FAIDSAIG_OD.TDCLIENT_01 - Researched - copied file - reset - failed again - researching more - will give additional details in later note

10/19/2012 02:54 EFLICK

FAIDSAIG_OD.TDCLIENT_01 - successful

1st failure / core dump:

Indications are that we were processing 3 files for message class CRDL13OP - one new and 2 resends from previous night aborts:

* looks like batch 2012-10-18T07:46:15.422369043 downloaded to CRDL13OP ok

* looks like batch 2012-10-16T08:55:19.502369043 failed causing core dump

* looks like batch 2012-10-16T08:47:25.942369043 also not downloaded

+ I copied the CRDL13OP from working _proxy_ to /userfiles/Ufaid/data

+ changed the "Run ISR files" prompt to "NOTISR"

+ reset job

2nd abort with another core dump:

* looks like CRDL13OP.002 batch 2012-10-16T08:47:25.942369043 downloaded ok (file 20121016A00259467416)

* looks like aborted in CRECMYOP for batch 2012-10-18T04:03:58.3000000001 (file 20121018A00259738479)

+ I copied the new CRDL13OP from the _proxy_ temp dir to /userfiles/Ufaid/data

+ confirmed the prompt was "NOTISR"

+ reset job

FOLLOW UP:

* Fin Aid probably didn't get one of the CRDL13OP batches that failed before (batch 2012-10-16T08:55:19.502369043)

* Fin Aid didn't get the CRECMYOP (batch 2012-10-18T04:03:58.3000000001)

* Evidence is mounting that we have data file corruption triggering the core dumps. We need to check with SAIG:

* Is there a known issue and patch available?

* Why did it start this week?

* I have logs and core dumps which we can send to SAIG support if needed

We did reboot Quartz this afternoon, so it should be very 'clean' resource-wise.

Aborted Module Name: AREGDYMP.VPLUS_RCAP-LOOP_01

Date: Day: Time: Resolution:

10/28/12 Sun 22:16 Restarted by Steve.

Error log and follow up comments:

+ egrep ABORTED|CRITFAIL|C-Error

9231188.00 BATCH AREGDYMP.VPLUS_RCAPT10/28 22:16 00:00:02 ABORTED AREGDYMP_MATH_PLACEMENT_LOAD

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

+ err=1

I followed Elden's approach from when this aborted with the same error on 10/14 -- checked if the file that RCAPTURE was trying to capture to Vista Plus was in Vista Plus -- it wasn't. I also checked the driver for the RCAP-LOOP and found it just had the one entry, so I reset the job. It finished and I confirmed the report is in Vista Plus.

Steve.

Aborted Module Name: KFSXAPPO.KFSX_JAVA_01

Date: Day: Time: Resolution:

10/29/12 Mon 20:07 Follow up below.

Error log and follow up comments:

2012-10-29 20:08:18,272 [RMI TCP Connection(150)-129.82.127.238] INFO org.kuali.kfs.module.purap.document.service.impl.PurchaseOrderServiceImpl :: autoCloseFullyDisencumberedOrders() PO ID 352881 with total 845.03 will be closed

2012-10-29 20:08:18,668 [RMI TCP Connection(150)-129.82.127.238] ERROR org.kuali.rice.kns.util.ObjectUtils :: error getting property value for class edu.csu.kfs.module.purap.document.PurchaseOrderDocument.checkPostingYearForCopy Unknown property 'checkPostingYearForCopy'

The receiving is messed up on this PO. John Swaro cannot manually close either.

This one is going to need a SQL close.

PO# 352881.

Then just let nightly batch run tonight and the job should go.

Theresa.

Hi Dermot

We will need to sql close this po. Please prepare the task. Use the update statement you asked about.

We will need a dba.

Josh.

Aborted Module Name: ODSRAROS.ODSRS003_01

Date: Day: Time: Resolution:

11/01/12 Thu 00:12 Restarted by Dermot.

Error log and follow up comments:

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUT_AR_SMR_FRZ

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 21

I am manually running the mapping now.

Is there anything dependent on this finishing successfully?

Mark. P.

Just the ODSRPROD_REFRESH_ODSPROD, all the other Refreshes have completed.

Dermot.

Joleen stopped by and said the same thing (more or less), but I need to know more detail of what ODSRPROD_REFRESH_ODSPROD is.

What components does it contain?

Mark. P.

These are the components that are left to run in ODSRPROD:

APMX FOLLOW UP = Generate various followup reports based on files present in /ais01/dat/misc/followup

SEND_MAIL = An email to update us on all the Refreshes

ODSRS004_02 = Log ODS Refresh Begin/End Times

CHAIN_FINISH = Chain Finish Tasks - cleanup/backup files, send email, etc.

Joleen.

It completed successfully in 1:48:57, go ahead and release the remaining jobs/components.

BEGIN

csug_run_owb_task('OWBREP', 'ODS_CSUBAN_LOCATION', 'PLSQL', 'LOAD_CSUT_AR_SMR_FRZ');

END;

PL/SQL procedure successfully completed

SQL>

SELECT count(*) FROM CSUT_AR_SMR; --149,192 records

SELECT count(*) FROM CSUT_AR_SMR_FRZ; --149,192 records

And thank you Joleen for the additional details on what was left to run.

Mark. P.

Aborted Module Name: AREGHRTM_FA.AREGS415_01

Date: Day: Time: Resolution:

11/05/12 Mon 00:01 Restarted by Joleen.

Error log and follow up comments:

FA had this comment:

No more data to read from socket

I checked for conditions, I restarted and it finished.

Joleen.

Aborted Module Name: AREGHRTM_SP.AREGS415_01

Date: Day: Time: Resolution:

11/05/12 Mon 00:01 Restarted by Joleen.

12/31/12 Mon 00:04 Restarted by Dermot.

01/07/13 Mon 00:02 Restarted by Joleen.

Error log and follow up comments:

11/05/12.

SP had this comment:

Closed Connection

I checked for conditions, I restarted and it finished.

Joleen.

12/31/12 .

Rec'd page:

"LAUNCH ERROR AREGHRTM_SP.AREGS415_01 9607078.01".

Restarted AREGHRTM_SP.AREGS415_01 /

AREGHRTM_SECTION_ENROLLMENT which has now finished.

Dermot.

01/07/13.

I was paged with the following message:

“LAUNCH ERROR AREGHRTM_SP.AREGS415_01 9646219.01”

There is no output file and no conditions. I restarted the job and it has finished.

Joleen.

Aborted Module Name: AREGTTRN.RWCLIENT_01

Date: Day: Time: Resolution:

11/07/12 Wed 06:22 Restarted by Joleen.

07/26/13 Fri 16:37 Restarted by Joleen.

Error log and follow up comments:

11/07/12.

<<errtrap_ssh.6>> print *** \n*** ERROR: Sneffels SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Sneffels SCRIPT ABORTED - EXIT CODE=1

***

<<errtrap_ssh.7>> exit

+ grep SCRIPT ABORTED

+ /ais02/log/AREGTTRN.RWCLIENT_01.9291261.9291275.00.2012_11_07_0622.log

+ 1> /dev/null

+ print rwclient execution unsuccessful

rwclient execution unsuccessful

+ cat /ais02/log/AREGTTRN.RWCLIENT_01.9291261.SEND_MAIL_ERR.DAT

REP-0177: Error while running in remote server Engine rwEng-0 crashed, job Id: 276171

Mark P. re-booted and we restarted the component and it completed.

Joleen.

07/26/13.

The RWCLIENT output said it couldn't establish a connection to Sneffels. While I was investigating, my AppMan screen went black. When I was able to log back in to AppMan, I restarted the RWCLIENT and the OSYS system-related process flow that was in DB Error. The RWCLIENT failed again with the same error of not being able to establish a connection. I didn't see any DBAs around. I caught Rich as he was leaving and he looked to see if Sneffels was up. Everything looked OK with Sneffels. We restarted RWCLIENT and this time it finished running.

-----Original Message-----

From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]

Sent: Friday, July 26, 2013 4:39 PM

To: IS DL: Alert APMX

Cc: gudrun.kokoszka@gmail.com

Subject: AWPROD APMXCHKS Abort Job Backlog Warning

Fri Jul 26 16:37:26 MDT 2013 Page 1

Check Backlog for ABORTED jobs (so_status 202)

Job                  Chain Id Start Date              Status Status Name Percentage Diff Observed Run Time (Min) Average Run Time (Min)
-------------------- -------- ----------------------- ------ ----------- --------------- ----------------------- ----------------------
AREGTTRN.RWCLIENT_01 11060761 07-26-2013 16:24:49 MDT 202    ABORTED     7938.56         749                     9

Aborted Module Name: AREGDYDL.FTPS_CURL_01

Date: Day: Time: Resolution:

11/09/12 Fri 20:04 Restarted by Elden.

Error log and follow up comments:

# > > USER $PUCSU

# > < 331 Send password please.

# > > PASS **********

# > < 530 PASS command failed

# > * Access denied: 530

I updated the old password with the temp password, then ran the password update, then reset FTPS_CURL_01. Please see the news file for additional information and let me know if you have questions.

Elden.

11/11/2012 02:56 EFLICK

Since we're going to have some maintenance in the morning and would like the schedule as clean as possible, I decided to look into AREGDYDL from Robin's ABORT note.

Since we've been having problems getting the password changed on the remote system and Jerry Becker contacted DOR about this, I suspected they must have reset the password.

I found an email confirming this from Jerry on Friday after I left.

I requested the AREGSPWD_DL_CHG_PASSWORD process flow with a hold, deleted the SEND_MAIL_01 and WAIT_FOR_RLSE since we already had these completed on Friday. Then I reset this process flow -- it finished successfully.

Then I reset AREGDYDL.FTPS_CURL_01, which finished successfully.

Aborted Module Name: HRMSKFSA.HRMS_SPAWN_OUT_01

Date: Day: Time: Resolution:

11/12/12 Mon 17:45 Deleted by Robin.

01/16/13 Wed 17:47 Deleted by Robin.

Error log and follow up comments:

*** COPY SPAWNED CONCURRENT REQUEST OUTPUT TO JOB OUTPUT FILE

+ print *** \n*** OUTPUT FROM SPAWNED CONCURRENT REQUEST 7429647 (PARENT REQUEST 7429624): \n***

+ 1>> /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out

+ cat /oraapps/hrprod/out/o7429647.out

+ 1>> /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out

+ read this_spawned_req

+ cut -f2 -d ?

+ print 7429648?G

+ grep C

+ print *** \n*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION \n***

***

*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION

/ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out

LDM SOB 2 Journal Import Execution Report Date: 12-NOV-12 17:41

Concurrent Request ID: 7429647 Page: 2

** Batches listed under "Unbalanced Batches**" have not been imported.

No resolution to this ABORT, it was decided to delete the ABORTED job and let the remaining dependent chain “KFSXCS41” complete.

HRMSKFSA.HRMS_SPAWN_OUT_01 / HRMSKFSA_KFS_ADJUSTMENTS ran successfully on the next nightly cycle.

Dermot.

01/16/13.

Similar error to the one received on 11/12/12.

I’ve got Steve Hill looking at this abort. The exact same thing happened in November with this job.

Steve G.

Steve Hill informed Stephen that he will research failed concurrent manager job 7507661. We won’t get output files for this job at this time.

As a result I reset component HRMSKFSA.HRMS_SPAWN_OUT_01 once I made changes to allow for correct output file collection.

Deleting the component would have meant that NO VPLUS output was captured. The script exits before moving the incomplete output file collected so far to a tempdir from where our AppMan VistaPlus component picks it up for VistaPlus upload.

Changes made prior to reset of component HRMSKFSA.HRMS_SPAWN_OUT_01 :

1. Delete string 7507661?G in file /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_FIND_01.spool.lis

2. Moved existing file /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out to /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out.bkp

Reset needs to create a new file. (Otherwise the existing file would get appended to, resulting in duplicate output.)

Gudrun.
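
A minimal sketch of how the status test in the trace above appears to work, assuming each spool entry has the form REQUEST_ID?STATUS and that only a status containing C counts as successful. Both points are inferred from the `cut -f2 -d ?` and `grep C` lines of the trace; the function names and the first sample entry are hypothetical.

```python
def split_spool_entry(entry):
    """Split 'REQUEST_ID?STATUS' on the first '?', like `cut -d ?` in the trace."""
    request_id, _, status = entry.strip().partition("?")
    return request_id, status


def completed_ok(entry):
    # Mirrors `cut -f2 -d ? | grep C`: success only if the status field contains C.
    return "C" in split_spool_entry(entry)[1]


def failed_entries(spool_lines):
    """Return the non-empty entries whose status does not contain C."""
    return [e.strip() for e in spool_lines if e.strip() and not completed_ok(e)]


# First entry is a hypothetical successful request; second is from the 11/12/12 trace.
spool = ["7429647?C", "7429648?G"]
print(failed_entries(spool))
```

Under these assumptions, removing an entry such as 7507661?G from the spool file means the reset no longer evaluates that request, which matches change 1 above.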

Aborted Module Name: KFSXAPEI.KFSX_JAVA_01

Date: Day: Time: Resolution:

11/16/12 Fri 05:31 Restarted by Dermot.

Error log and follow up comments:

orkflowDocumentServiceImpl :: routeDocument: org.kuali.rice.kew.routeheader.DocumentRouteHeaderValue@54965496[

routeHeaderId=2150801

documentTypeId=320823

docVersion=1

docTitle=Electronic Invoice Reject Document - PO: 356801 Vendor: Fisher Scientific Co

createDate=2012-11-16 05:31:16.0

initiatorWorkflowId=1

routedByUserWorkflowId=1

docRouteStatus=R

routeStatusDate=2012-11-16 05:32:59.639

statusModDate=2012-11-16 05:32:59.639

docRouteLevel=0

routeLevelDate=<null>

approvedDate=<null>

routeLevelDate=<null>

approvedDate=<null>

finalizedDate=<null>

appDocId=<null>

2012-11-16 05:32:59,765 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Saving Invoice Reject for DUNS '150982189'

2012-11-16 05:32:59,765 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.rice.kns.document.DocumentBase :: invoking rules engine on document 2150806

2012-11-16 05:32:59,803 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.rice.kns.document.DocumentBase :: [document.invoiceOrderReferenceDocumentReferencePayloadIdentifier] error.format.org.kuali.rice.kns.datadictionary.validation.charlevel.AnyCharacterValidationPattern(Invoice Order Reference Document Reference Payload Identifier (Identifier))

2012-11-16 05:32:59,803 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.rice.kns.document.DocumentBase :: [document.invoiceOrderReferenceOrderIdentifier] error.format.org.kuali.rice.kns.datadictionary.validation.charlevel.AnyCharacterValidationPattern(Invoice Order Reference Order Identifier (Identifier))

2012-11-16 05:33:00,022 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.rice.ksb.messaging.serviceproxies.MessageSendingTransactionSynchronization :: Message [RouteQueue: , routeQueueId=null, ipNumber=129.82.127.238serviceNamespace=KFS, serviceName={KFS}SearchableAttributeProcessorService, methodName=indexDocument, queueStatus=R, queuePriority=30, queueDate=2012-11-16 05:30:34.059] not sent because transaction not committed.

2012-11-16 05:31:23,463 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Reject document has been created (DocNo=2150806)

2012-11-16 05:31:23,464 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: 150982189_8052239899_10047594884081591.xml has been rejected

I found the offending xml file, renamed it as BAD, removed all the processed files & restarted the ABORTED job & it finished successfully.

Dermot.

Aborted Module Name: ADMSBSLT.LYNX_01

Date: Day: Time: Resolution:

11/16/12 Fri 22:30 See follow up below.

Error log and follow up comments:

STATUS=HTTP/1.1 200 OK

URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100
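
A minimal sketch of the kind of output check that could produce err=100 here, assuming the job scans its output for STATUS= lines and fails if any of them is not 200. The actual LYNX wrapper logic is not shown in this log, so the function name and the matching rule are assumptions.

```python
import re

# Matches lines such as "STATUS=HTTP/1.1 500 Internal Server Error".
STATUS_RE = re.compile(r"^STATUS=HTTP/\d\.\d (\d{3})")


def check_statuses(output_lines):
    """Return (err, codes): err is 100 if any STATUS= line is not 200, else 0."""
    codes = []
    for line in output_lines:
        match = STATUS_RE.match(line.strip())
        if match:
            codes.append(int(match.group(1)))
    err = 100 if any(code != 200 for code in codes) else 0
    return err, codes


sample = [
    "STATUS=HTTP/1.1 200 OK",
    "URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)",
    "STATUS=HTTP/1.1 500 Internal Server Error",
]
print(check_statuses(sample))
```

If the output file is appended to on each run, a check like this would also pick up a 500 left over from an earlier attempt, so the result should be read against the most recent lines only.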

DO NOT restart this Process Flow without the OK from the users. (Kathy Banister, Marcella Vininski)

The user may have rerun the aborted LYNX job manually.

If the LYNX module was rerun by the user, delete the job from AppMan.

Marcella,

Do you want us to restart this chain?

Vicki.

I’m still trying to figure out what the error was and why it aborted. Please let me look into this a little further before you rerun the program.

Marcella.

Here is some more info from the lynx stdout file:

It looks like maybe it was having connection issues?

<title>The remote server returned an error: (425) Can't open data connection.</title>

<style>

body {font-family:"Verdana";font-weight:normal;font-size: .7em;color:black;}

p {font-family:"Verdana";font-weight:normal;color:black;margin-top: -5px}

b {font-family:"Verdana";font-weight:bold;color:black;margin-top: -5px}

H1 { font-family:"Verdana";font-weight:normal;font-size:18pt;color:red }

H2 { font-family:"Verdana";font-weight:normal;font-size:14pt;color:maroon }

pre {font-family:"Lucida Console";font-size: .9em}

David.

Let’s go ahead and rerun this then if you think it won’t cause any additional problems.

Marcella.

Though the ADMSBSLT.LYNX_01 completed on AppMan, Gudrun did notice an error in the log file (highlighted below). Can you verify that this worked okay on your end?

URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)

David.

The file was created and it looks good, so I am not sure why we received this error. I need to do some further research on why we are still getting this error. Do you know if the file layout was recently changed?

Marcella.

Thanks for checking. The error may be from the previous run.

David.

I see the problem: there is a duplicate in the file that has two different Slate IDs. We are having trouble with Slate creating duplicates on their end and are working on resolving this issue with them. In the meantime I’ll clean up his Banner and Slate record so we don’t have this issue with his record, though we may see this again with other duplicates that Slate is creating.

Marcella.

Aborted Module Name: AREGSPWD.FTPS_NEW_PASS_01

Date: Day: Time: Resolution:

11/21/12 Wed 10:25 See follow up below.

07/28/14 Mon 09:35 Restarted by Joleen.

Error log and follow up comments:

11/21/12.

# CMDOUT # [331 Send password please.]

# CMDOUT # [530 PASS command failed]

# CMDOUT # [530 You must first login with USER and PASS.]

# CMDOUT # [DONE]
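
A minimal sketch for reading the reply codes out of CMDOUT lines like the ones above, assuming every server reply is wrapped as `[NNN text]`. In the FTP reply code scheme, 331 means the user name was accepted and a password is needed, and 530 means not logged in, so a 331 followed by 530 points at the password being rejected. The function names are hypothetical.

```python
import re

# A three-digit reply code followed by its text, inside square brackets.
CMDOUT_RE = re.compile(r"\[(\d{3}) ([^\]]*)\]")


def reply_codes(log_text):
    """Extract (code, text) pairs from '# CMDOUT # [NNN text]' lines."""
    return [(int(code), text) for code, text in CMDOUT_RE.findall(log_text)]


def login_rejected(log_text):
    # 331 = password requested, 530 = not logged in.
    codes = [code for code, _ in reply_codes(log_text)]
    return 331 in codes and 530 in codes


log = """# CMDOUT # [331 Send password please.]
# CMDOUT # [530 PASS command failed]
# CMDOUT # [530 You must first login with USER and PASS.]
# CMDOUT # [DONE]"""

print(reply_codes(log))
print(login_rejected(log))
```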

Jerry,

Is this a problem with the new password, or something else?

Steve. G.

I'm guessing the current password didn't work? Let me know if I need to have the state re-set it.

Jerry.

The old password should still work -- do you have a method to manually log in and verify that the old one still works?

If you can't log in with the old password, then we will need to have the State reset it.

Can you ask them why we're having problems with it and see if they require us to keep the same password for a certain amount of time?

Elden.

I don't have direct access to their system so I can't actually log in myself. But I'll certainly ask if we need to keep a password a minimum amount of time.

Jerry.

# CMDOUT # [331 Send password please.]

# CMDOUT # [530 PASS command failed]

# CMDOUT # [530 You must first login with USER and PASS.]

# CMDOUT # [DONE]

I see what I did - the new password I created had a repeating character. I'll change it.

Jerry.

07/28/14.

# CMDOUT # [331 Send password please.]

# CMDOUT # [530 PASS command failed]

# CMDOUT # [530 You must first login with USER and PASS.]

Ok - I called the DOR and they've reset the password to what I had submitted via AppMan.

Jerry.

Aborted Module Name: ODSRAROS.ODSRS003_01

Date: Day: Time: Resolution:

12/01/12 Sat 00:04 See follow up below.

01/01/13 Tue 01:57 Restarted by Joleen (see followup).

Error log and follow up comments:

12/01/12.

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUT_AR_SMR_FRZ

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 21

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

Here is the error I found:

CSUT_AR_SMR_FRZ ORA-08103: object no longer exists 01-DEC-12 CSUBAN

Same error as Nov. 1; the table is there and I can select from it. The mapping did run but must have lost its connection before it finished.

Please re-start the job and I will create a Clarity ticket for it on Monday to look into the problem more.

Mark P.

12/01/2012 12:48 MPAQUETT

Error with OWB mapping LOAD_CSUT_AR_SMR_FRZ

ORA-08103: object no longer exists. Table does exist in ODS Prod.

I have asked Robin to run the job again and notify me if it errors again.

Had same error on Nov. 1 run, will create a Clarity ticket to look into problem further.

01/01/13.

ORA-20000: ERROR running LOAD_CSUT_AR_TERM_DATA_FRZ

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 18

I called the DBA cell a few times and I left a message a few minutes ago with Mark P.

Joleen.

It looks like a data problem, here is the error I am seeing.

481 CSUT_AR_TERM_DATA_FRZ ORA-01427: single-row subquery returns more than one row 01-JAN-2013 01:01:47 AM

Mark P.

Do you need me to delete this component? It looks like we might need to have the stakeholders take a look?

Joleen.

Aborted Module Name: AGENDYGN.AGENS006_01

Date: Day: Time: Resolution:

12/03/12 Mon 19:01 Restarted by Joleen.

Error log and follow up comments:

11378385 DX 01-JAN-00 Address update needed

Error: ORA-20100: ::Hold from date must be less than or equal to hold to date::

ERROR at line 1:

ORA-20100: ::Hold from date must be less than or equal to hold to date::

ORA-06512: at line 1886

19:01:10 1884 utl_file.fflush(file_handle);

19:01:10 1885 utl_file.fclose(file_handle);

19:01:10 1886 raise; -- reraise the exception

The problem is with the end dates on addresses (12/31/2099)

When the most future Mailing Address has an end date on it, we place a hold (DX) on that person starting on the day after the mailing address ends – in this case 12/31/2099 + 1 day = 01/01/2100.

We should not be putting end dates on addresses 87 years in the future!

PIDM	CSU ID	Last Name	First Name	Middle Name
11378385	829995699	Li	Yinan
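
A minimal sketch of the date arithmetic described above: the hold starts one day after the mailing address end date, so an end date of 12/31/2099 gives a hold from date of 01/01/2100, which is what the log prints with a two-digit year as 01-JAN-00. The hold to date used by the job is not shown in the log; the 12/31/2099 limit below is purely an assumption for illustration, as are the function names.

```python
from datetime import date, timedelta


def hold_from_date(address_end):
    """The hold starts the day after the mailing address ends."""
    return address_end + timedelta(days=1)


def hold_is_valid(hold_from, hold_to):
    # Same rule as the ORA-20100 message: from date must be <= to date.
    return hold_from <= hold_to


ASSUMED_HOLD_TO = date(2099, 12, 31)  # hypothetical; not taken from the log

start = hold_from_date(date(2099, 12, 31))
print(start)
print(start.strftime("%y"))  # two-digit year, as in the log line
print(hold_is_valid(start, ASSUMED_HOLD_TO))
```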

Karen,

I see you putting many future-dated end dates on addresses that I believe are unnecessary. Many are on RA addresses.

This causes serious problems when they are placed on MA addresses.

Can someone please remove this to date for this person’s Mailing Address and let us know so that we can restart the schedule?

Vicki.

I removed the To Date on both the mailing and RA addresses for the student below. Will you be able to locate and replace any others I may have done? So sorry, I will leave the address To Date blank going forward!

Karen.

Aborted Module Name: HOUSADDR.AGENS016_01

Date: Day: Time: Resolution:

12/28/12 Fri 17:01 Deleted by Joleen.

12/31/12 Mon 17:01 Resubmitted by Joleen.

08/04/14 Mon 10:34 Resubmitted by Joleen.

Error log and follow up comments:

12/28/12.

Problem Inserting the Address record for:

A,11301960,HA,"C317 Summit Hall",,,Fort Collins,CO,805215244,,20121228,20130517,HOUS,,,,,RMS

error is: ORA-20302: An address cannot be added with the same from_date as an existing address.

ERROR at line 1:

ORA-20500: Error count Exceeded 200

ORA-06512: at line 316

17:01:20 314

17:01:20 315 if v_api_count > 200 then

17:01:20 316 raise_application_error(-20500,'Error count Exceeded 200');

17:01:20 317 end if;

I manually ran this process early in the day Friday to create the new file for the spring semester and I thought I had shut down the automated process. Instead, it looks like the automated process also ran and created a duplicate file.

The combined files were sent over to kebler and it looks like they choked and died. I'll re-create the file and send it over tonight.

Greg Fend.
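
A minimal sketch of a pre-load check for the situation described above, where two copies of the same export end up in one file and every repeated record trips ORA-20302. The field positions (ID in the second field, address type in the third, from date in the eleventh) are assumptions inferred from the single sample record in the log, and the function name is hypothetical.

```python
import csv
from collections import Counter

# Assumed zero-based field positions, inferred from the one sample record.
PIDM, ADDR_TYPE, FROM_DATE = 1, 2, 10


def duplicate_keys(lines):
    """Return (id, address type, from date) keys that occur more than once."""
    rows = csv.reader(lines)
    counts = Counter(
        (row[PIDM], row[ADDR_TYPE], row[FROM_DATE])
        for row in rows
        if len(row) > FROM_DATE
    )
    return [key for key, n in counts.items() if n > 1]


record = 'A,11301960,HA,"C317 Summit Hall",,,Fort Collins,CO,805215244,,20121228,20130517,HOUS,,,,,RMS'
combined = [record, record]  # simulates the manual and automated files being combined
print(duplicate_keys(combined))
```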

I have deleted the job from Friday. Tonight's job will pick up your re-created file.

Joleen.

12/31/12.

FYI - I bet the address job aborted again on Monday night because the old address file was still on kebler. I've deleted the old file and uploaded the correct one.

Is it possible to re-run the job to load these addresses?

Greg Fend.

I submitted HOUSADDR_ASSIGNMENT_EXPORT and it has finished running.

Joleen.

08/04/14.

ERROR at line 1:

ORA-20500: Error count Exceeded 200

ORA-06512: at line 316

I have attached the utl file which shows the errors.

Thanks Joleen. It looks like the file may not be formatted correctly.

I'll take a look and see what I can discover.

Greg.

Aborted Module Name: FAIDEPLS_OD.LYNX_02

Date: Day: Time: Resolution:

01/04/13 Fri 10:06 Restarted by Joleen.

Error log and follow up comments:

The file you sent me said that the error was here:

pidm = (decimal)rdrGetPop["pidm"];

fund_code = (string)rdrGetPop["fund_code"];

offer_amt = (decimal)rdrGetPop["offer_amt"];

but that can’t be because that’s not the code in the file (I removed the “(decimal)” cast text). The code in the file is this:

pidm = rdrGetPop["pidm"].ToString(); //prod

fund_code = (string)rdrGetPop["fund_code"];

offer_amt = (decimal)rdrGetPop["offer_amt"];

This means that either the lynx procedure doesn’t point at WSNET, WSNETDEV, or it’s storing the page in some sort of cache and not requesting new page content from the server…

Zach Garno.

Here is the URL it is using:

http://wsnet.colostate.edu/cwis231/onet/eplus/faidepls_api_rprawrd.aspx?ay=1213&treq=OD Correct?

I believe the LYNX step runs the latest code each time. Is this not true?

This was re-started at 07:49 this morning. Has your code changed since then? Do you want us to try again?

David.

Please try to re-run the job.

Zach.

The first LYNX step completed successfully. The second LYNX step has now failed. I have attached the LYNX standard output.

Just an FYI, the standard output file appends so the latest info will be at the bottom of the file if we have to rerun.

David.

Aborted Module Name: ODSRAGEN.ODSRS001_01

Date: Day: Time: Resolution:

01/16/13 Wed 03:00 Restarted by Dermot but ABORTED again @ 8am, see below.

Error log and follow up comments:

ODSRAGEN.ODSRS001_01.9701248.9701250.00.2013_01_16_0004.jobout 100

no output from ODSRAGEN.ODSRS001_01

+ err=100

I checked this CRITFAIL & did not find any similar errors in our ABORT logs.

I called and left a message on the DBA cell.

As per Craig, I restarted ODSRAGEN.ODSRS001_01 @ 03:44.

Dermot.

The mapping DELETE_MST_GENERAL_STUDENT failed last night with the following error.

Error Msg: ORA-01555: snapshot too old: rollback segment number 19 with name "_SYSSMU19_2294121418$" too small

This caused the UPDATE_MST_GENRL_STDNT_STEP_2 to fail with a unique constraint error because the delete did not complete. This is the same error we ran into last week. I will work on adjusting the rollback segment to help eliminate this problem in the future.

Also note there appears to be a larger than normal download of data hitting the ODS today, so expect things to run longer than average.

Mark P.

It looks like the UPDATE_MST_GENRL_STDNT_STEP_2 has stalled out. It has been running since 3:26 this morning and usually runs in 15 mins (give or take).

I did work with Mark B to get the rollback segment adjusted on ODS Prod.

I am going to kill the process on the Oracle side and would like to have it restarted via Appman.

Mark P.

The UPDATE_MST_GENRL_STDNT_STEP_2 mapping completed and things are still moving…

Mark P.

Aborted Module Name: FAIDSAIG_EV.TDCLIENT_01

Date: Day: Time: Resolution:

02/02/13 Sat 02:06 See follow up below.

Error log and follow up comments:

I verified that component aborted while trying to receive ISIR files. No files were received !

# 20130202-000601 : pipe_exec | cmdout = <WARNING: Failed to connect to server>

# 20130202-000601 : pipe_exec | cmdout = <Error connecting to network SAIGPORTAL>

# 20130202-000601 : pipe_exec | cmdout = <(-1) FTP connection attempt failed.>

# 20130202-000601 : pipe_exec | cmdout = <Connection refused

I attempted a reset again around 2am but got the same result – the connection failed again. I should have given them time to troubleshoot.

Either the server on their side is still having issues or the password is incorrect. This needs to be followed up during the daytime. At the moment I don’t know whom to call on their side. I also checked whether we changed the password yesterday: negative. FAIDSPWD did not run yesterday.

Critical AppMan job component FAIDSAIG_EV.TDCLIENT_01 failed last night. We are investigating. Connection attempts to SAIGPORTAL are being refused.

Gudrun.

Password issue? Did FAIDSAIG_OD run successfully? SAIGPORTAL may be down??

Phil.

To confirm 100% that SAIGPORTAL is down I am contacting Rich. I need access to quartz. An ftp connect attempt from the command line should support the conclusion that SAIGPORTAL is down.

I checked with Rich and the information received supports that SAIGPORTAL is down. An FTP attempt from quartz to server SAIGMAILBOX.ED.GOV times out, as does any ping command (although ping might simply be blocked). Unless server access is restored, the production FAID schedule started yesterday won’t complete this weekend. I think Candy Chapman needs to be contacted; however, I believe you would prefer doing this. Let me know if I can help with anything else.

Gudrun.
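
A minimal sketch of a command-line style reachability check like the one described above, assuming a plain TCP connect to the FTP control port is enough to tell "connection refused" apart from "timed out". The function name, the default timeout, and the use of port 21 for the real server are assumptions; the demonstration below only connects to a listener on the local machine.

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP connect attempt as open, refused, timeout, or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        # Covers name resolution failures and other network errors.
        return "error"


# Local demonstration: a listening socket on an ephemeral port.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
local_port = server.getsockname()[1]
print(probe("127.0.0.1", local_port))
server.close()
```

A "refused" result means something answered and rejected the connection, while "timeout" suggests the host is unreachable or filtered, which is the distinction the follow-up above relies on.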

Candy,

I just would like to inform you that we won’t be able to complete the FAID schedule this weekend

UNLESS access to SAIGPORTAL server SAIGMAILBOX.ED.GOV gets re-established.

Last night FAIDSAIG_EV.TDCLIENT_01 aborted with ftp error – connection refused. Several reset attempts after midnight failed since then with the same error. The AROS schedule may not complete this weekend due to a delay in the FINAID schedule. Access to a state server has not been restored at this point in time. The server refuses connections. I sent out email alerts also to CLMS and AROS IS DL email lists to alert them of their schedule being delayed. Emailed Candy Chapman. I don’t have her phone number. But do think Phil needs to call. He has been contacted and responded earlier to an email.

Gudrun.

I found an old Email with Candy Chapman's number and then called Vicki to verify that it was okay to call her. Vicki said to do so. I called Candy and she said that the SAIG server is down because of a planned outage. Candy was going to cancel the affected FAID jobs but had forgotten. She is also on the road now, but will send an Email of jobs we can turn the flags off so they don't run.

David.

I apologize. Karma thought she had canceled the affected jobs, but she thought we just couldn't SEND stuff to SAIG. She hadn't thought about picking stuff up from them. So we think we might as well cancel the entire schedule. She thinks they will be down until sometime tomorrow. We'll figure it out on Monday.

Candy.

I called Josh. I was not sure and he needs input regarding if it is ok to release CLMS process flow CLMSDISB_FINAID_DISBURSEMENTS. Dependent AROS process flows were released earlier this morning after the FAID schedule got deleted. Remaining backlog FAID schedule process flows got deleted except for FAIDAM99. AROS schedule completed as scheduled – no deletions. Rob did ok for remaining AROS process flows to run despite FAID schedule deletion. CLMS schedule is STILL on hold. Waiting on ok from Phil that CLMSDISB_FINAID_DISBURSEMENTS can be released.

Gudrun.

Phil called; I missed his call but called him back. He said to delete CLMSDISB but allow CLMSDATA to run. I did this and the CLMS schedule is complete.

David.

Aborted Module Name: FAIDTRAK_OD.GLBDATA-LOOP_02

Date: Day: Time: Resolution:

02/14/13 Thu 00:51 See follow up below.

Error log and follow up comments:

02/14/13.

230 Error Timed out waiting for response on client pipe 240.

+ iteration_done=yes

+ [[ yes = no ]]

+ spawned_module_name=GLBDATA

+ . SRC_APMX_STATUS_FOR_SPAWNED.KSH

+ set -x

+ awexe jh

+ egrep ABORTED|CRITFAIL|C-Error

+ grep 9880772

+ print FAIDTRAK_OD_DT_HOLD

+ 1>> /ais01/dat/work/prod/FAIDTRAK_OD.GLBDATA-LOOP_02.selections_done

+ awexe upd_var_value subvar=#glbdata_iterations_9876682 var_value=21 flag=Y

A UC4 ticket was created for prod error: “230 Error Timed out waiting for response on client pipe 240.” UC4 Ticket #211836

Subject: 230 Error Timed out waiting for response on client pipe 240.

Priority Level: 2 (1 highest – 4 lowest)

230 Error Timed out waiting for response on client pipe 240.
Last week we observed the above *Time out* error in our production instance twice. Our test instance also encountered the same issue.

A general slowdown is being observed in both instances.
Could you suggest a troubleshooting strategy and possible tuning parameters we could point out to our system staff?
Gudrun.

Aborted Module Name: HRMSS230.HRMSS230_01

Date: Day: Time: Resolution:

02/16/13 Sat 07:21 See follow up below.

03/16/13 Sat 06:50 See note from Gudrun below.

Error log and follow up comments:

HRMSS230.HRMSS230_01 is in EMPTY_EXTR status:

Abort status was set by a check file condition. The condition is looking for {#utl_file1_{chain_id}}. I do see a utl file1 for HRMSS230.HRMSS230_01 out in /orautl/hrprod but it doesn’t have the chain id. I’m not sure where the job is looking for the utl file1. Someone smarter than me can figure this one out.

I checked this out -- as Joleen said, the utl file was indeed present, so I tried restarting the aborted job and it finished successfully. Maybe just some weird timing issue?

Steve. G.

02/16/2013 10:37 JWEARNE

I logged in earlier today and noticed 2 aborts.

FAIDTRAK_OD.LYNX_01 and HRMSS230.HRMSS230_01. I emailed Candy for the FAID abort. She came in and fixed their Webpage and had me restart. Their schedule is progressing now. She would like to be emailed if there are any more LYNX aborts.

03/16/13.

Aborted with status EMPTY_EXTR. CHECK_FILE AFTER condition failed to detect file HRMSS230.HRMSS230_01.utl_file in /orautl/hrprod.

Workaround: PL/SQL completed. No reset. Manually complete the last two conditions before deleting component.

This abort got resolved. Before deleting the component I copied the generated file to

/ais01/ftp/to/user/HRMSS230.VSP.DAT

/userfiles/Uhrben/data/HRMSS230.VSP.DAT.

Process flow HRMSS230 completed shortly afterwards.

Gudrun.

Aborted Module Name: AREGTTRN.SSH_SFTP_01

Date: Day: Time: Resolution:

02/18/13 Mon 00:20 Restarted by Joleen.

05/08/13 Wed 03:20 Restarted by David.

09/10/13 Tue 11:10 Restarted by David.

Error log and follow up comments:

02/18/13.

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > ssh: connect to host iwantmytranscript.com port 22: A remote host did not respond within the timeout period.

***

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

I saw several references to this type of error in the abort log document, and in most cases the job was just restarted. I tried this and it has finished successfully.

Joleen.
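
A minimal sketch of the "just restart it" handling described above, written as a bounded retry around a transfer step. The function name, the attempt count, and the delay are assumptions; nothing here reflects how AppMan itself schedules restarts.

```python
import time


def run_with_retries(action, attempts=3, delay_seconds=0.0, retry_on=(OSError,)):
    """Call action(); on a listed error, wait and try again up to `attempts` times."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds)
    # Every attempt failed: surface the last error to the caller.
    raise last_error


calls = {"n": 0}


def flaky_transfer():
    # Stand-in for the SFTP step: times out once, then succeeds.
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError("remote host did not respond")
    return "transferred"


print(run_with_retries(flaky_transfer))
```

An automatic retry is only safe when a partial first attempt cannot leave files behind; the 05/08/13 entry in this section shows that check being done by hand before the restart.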

05/08/13.

# > ssh: iwantmytranscript.com: Hostname and service name not provided or found

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

I re-started after checking that no files had been received.

David.

09/10/13.

Something not ok with AREGTTRN.SSH_SFTP_01. Its driver is empty.

Gudrun.

I killed this process. It will retry at 11:20

AREGTTRN.SSH_SFTP_01 appears to be working now.

David.

Aborted Module Name: FAIDTKNT_OD.LYNX_01

Date: Day: Time: Resolution:

02/28/13 Thu 03:00 Restarted by Joleen.

03/01/13 Fri 03:02 Restarted by Joleen.

Error log and follow up comments:

FAIDTKNT_OD.LYNX_01 aborted. I saw your email about a new URL. I replaced the URL with the new one below and restarted. It aborted again. Below is the standard output from the second time it aborted. The message from the first abort was: You have requested a page that either never existed or no longer exists on this web server

http://wsnet.colostate.edu/cwis231/autorun/JobChain/parent_tracking_notification.aspx?ay={#2}

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<html>

<head>

<title>Untitled</title>

</head>

<body>

You have requested a page that either never existed or no longer exists on this web server.<BR><BR>

The web page you are visiting is part of a larger site maintained by a department

at <a href="http://www.colostate.edu">Colorado State University</a>.<BR><BR>

If you came to this page via a "bookmark", this page may have been moved. You may be able to rectify this

I'm sorry. I thought the original URL would still work until you could make the change. I'll fix it as soon as I get in - just a few minutes.

Candy.

03/01/13.

<head>

<title>ORA-20100: ::Cannot create, record already exists::<br>ORA-06512: at "BANINST1.CSUG_API_GLBEXTR", line 287<br>ORA-06512: at line 1</title>

<style>

body {font-family:"Verdana";font-weight:normal;font-size: .7em;color:black;}

p {font-family:"Verdana";font-weight:normal;color:black;margin-top: -5px}

The error means there’s already a record in the GLBEXTR for the ID, so it won’t insert one twice. I’m not sure why, but let me take a look and see if I can tell which is the duplicate record.

Candy.

Has anyone heard anything about what to do with aborted LYNX job FAIDTKNT_OD.LYNX_01 ?

Gudrun.

I’m working on it. / Candy fixed the LYNX for us. (Thank you, Candy! You made my day!) FAIDTKNT_TRACK_NOTIFICATION has finished running.

Joleen.

Aborted Module Name: ODSAROS.ODSRS003_01

Date: Day: Time: Resolution:

03/01/13 Fri 11:00 Killed & Restarted by Gudrun.

Error log and follow up comments:

We have a refresh that has been running 11 ½ hours. It is: REFRESH_FV_AR. Would it be possible to check it out and make sure that everything is OK and that it is not hung up?

Joleen.

It is the LOAD_CSUT_AR_SMR_FRZ, which has caused problems before. It either runs in under 2 hours or hangs.

Please kill and restart, I will try to keep an eye on it.

LOAD_CSUT_AR_SMR_FRZ 00: 00: 00 RUNNING 01-MAR-2013 01:03:51 01-MAR-2013 01:03:51

Mark P.

I killed and restarted process flow component ODSAROS.ODSRS003_01.

Others out for lunch.

What a Friday ?! … --- still one thing out there to tackle --- ODSRAROS

Do you guys prepare for running overtime for AROSAM99 ?

Gudrun.

I just spoke to Joleen about ODSRAROS. If it is still running this evening, she said she would call the DBA cell and have them look at it again. We can also force AROSAM99 to finish without ODSRAROS being done so that we can get tonight’s AROS schedule going, which is probably what we will do.

Steve.

What are your thoughts about the LOAD_CSUT_AR_SMR_FRZ? We restarted it 3 hours ago and it is still running. We’re trying to formulate a plan for tonight’s schedule.

Joleen.

Mark B. called me about the two csuban jobs running and causing problems. One of the jobs had been running for over 14 hours; it probably didn’t die when we re-started the job this morning (nice catch, Mark B.).

I killed it and hope this allows the current one to finish.

Mark P.

Mark P called. The refresh is processing and is not hung. He will check again in a couple hours.

Joleen.

The AR FRZ job is done!

01-Mar-2013 22:37:35 ODSRAROS

ODSRAROS_REFRESH_AROS_ODSPROD

Glad to have that obstacle out of the way for the Banner Agent upgrade on Sunday!

Joleen.

Aborted Module Name: ADMSSRLD_DY.SQLSURLOAD-LOOP_01

Date: Day: Time: Resolution:

03/05/13 Tue 22:24 See note from David below.

Error log and follow up comments:

+ print ABRD_BRSPI

+ 1>> /ais01/dat/work/prod/ADMSSRLD_DY.SQLSURLOAD-LOOP_01.sqlsurload_done

+ awexe upd_var_value subvar=#sqlsurload_iterations_10003798 var_value=1 flag=Y

ERROR -999 ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 1

ORA-06512: at "APPWORX.AWDYN", line 23

ORA-06512: at line 1

I fixed the variable size. I changed #sqlsurload_iterations_{chain_id} to #sqlsurload_iter_{chain_id} in the SQLSURLOAD-LOOP.KSH

David.
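
A minimal sketch of the name-length check behind this fix. The 30-character limit is an assumption; the log only shows that the 31-character name #sqlsurload_iterations_10003798 overflowed the buffer. The helper names are illustrative and not part of SQLSURLOAD-LOOP.KSH.

```python
# Assumption: 30 characters is the longest substitution-variable name that fits.
# The log only shows that a 31-character name raised ORA-06502.
MAX_SUBVAR_NAME_LEN = 30


def subvar_name(template: str, chain_id: int) -> str:
    """Expand a template such as '#sqlsurload_iter_{chain_id}'."""
    return template.replace("{chain_id}", str(chain_id))


def fits(template: str, chain_id: int, limit: int = MAX_SUBVAR_NAME_LEN) -> bool:
    """Return True when the expanded name is no longer than the assumed limit."""
    return len(subvar_name(template, chain_id)) <= limit


if __name__ == "__main__":
    for template in ("#sqlsurload_iterations_{chain_id}", "#sqlsurload_iter_{chain_id}"):
        name = subvar_name(template, 10003798)
        print(name, len(name), "ok" if fits(template, 10003798) else "too long")
```

Under that assumption the old name would have fit while chain IDs had seven digits and stopped fitting once they reached eight.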

Aborted Module Name: ODSRFAMS_REFRESH_FAMS_ODSPROD

Date: Day: Time: Resolution:

03/12/13 Tue 23:22 Restarted by Joleen.

Error log and follow up comments:

ORA-20000: ERROR running LOAD_FAMIS_DEPT_SPACE_FUNC

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 288

23:22:54 287 WHEN 'REFRESH_FAMIS_CSU' THEN

23:22:54 288 csug_run_owb_task('OWBREP', 'ODS_CSUFAMIS_LOCATION', 'PLSQL', 'LOAD_FAMIS_DEPT_SPACE_FUNC');

23:22:54 289 csug_run_owb_task('OWBREP', 'ODS_CSUFAMIS_LOCATION', 'PLSQL', 'LOAD_FAMIS_EMP_SPACE_DEPT');

23:22:54 290 csug_run_owb_task('OWBREP', 'ODS_CSUFAMIS_LOCATION', 'PLSQL', 'L_SPACE_BUILDING_T');

23:22:54 291 WHEN 'REFRESH_HR_GRAD_ASST_CSU' THEN

23:22:54 292 dbms_mview.refresh('CSUBAN.CSUH_GRAD_ASST_APPROVALS_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

23:22:54 293 WHEN 'REFRESH_SALX_CSU' THEN

23:22:54 294 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_APPT_TYPE_T');

23:22:54 295 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_CUR_FY_GP_T');

23:22:54 296 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_CUR_FY_JA_MTH_T');

23:22:54 297 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_EMPLOYEES_T');

23:22:54 298 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_EMPL_TYPE_T');

23:22:54 299 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_HRDEPT_T');

23:22:54 300 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_JOB_CLASS_T');

23:22:54 301 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_REF_CODES_T');

23:22:54 302 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_SECURITY_T');

23:22:54 303 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_UNIT_CODE_T');

23:22:54 304 csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_USERS_T');

23:22:54 305 ELSE

23:22:54 306 DBMS_OUTPUT.PUT_LINE ('INVALID REFRESH NAME');

23:22:54 307 END CASE;

23:22:54 308

23:22:54 309 csug_ods_refresh.log_end_time(mat_view_type);

23:22:54 310

23:22:54 311 end;

There was a locked account on FMSPROD that caused the job to fail.

Error:

"ORA-28000: the account is locked

ORA-02063: preceding line from FMSPROD@FAMIS_LINK"

The account has been unlocked and I asked David to restart the job.

Mark P.

Aborted Module Name: KFSXTXEF.CHAIN_FINISH_01

Date: Day: Time: Resolution:

03/15/13 Fri 17:01 Deleted by Dermot & re-ran successfully on the 21st March.

Error log and follow up comments:

+ 1>> /ais01/dat/work/prod/KFSXTXEF.CHAIN_FINISH_01_10072226_jobstat

+ cat /ais01/dat/work/prod/KFSXTXEF.CHAIN_FINISH_01_10072226_jobstat

*** SEARCH OF LAST JOB (10072228.00) AFTER CONDITIONS RUNHOSTLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

A file or directory in the path name does not exist.

Component KFSXTXEF.CHAIN_FINISH_01 aborted due to a runhost log error in file /ais01/joblog/runhost_10072226_AWPROD.log. File ORIG.29A52*txt could not be moved to bkp.

Cause:

It seems the file was not picked up correctly in the first place because the BEFORE SEND MAIL check-file condition looks for the file in the wrong directory. The cause of the faulty check is subvar /{#apmx_last_yyyy}/, which has the value 2012 when it should probably be 2013, since the file can be found in dir {#kfsx_staging_{chain_id}}/tax/2013

{#kfsx_staging_{chain_id}}/tax/{#apmx_last_yyyy}/ORIG.29A52*txt.

FYI

BOTH SEND_MAIL_01 and SEND_MAIL_02 need to be re-run in their entirety, I assume, to complete the process flow correctly, UNLESS the KFSX_JAVA component should have been run for tax year 2012. (This needs to be verified.)

KFSXTXEF.CHAIN_FINISH: a subvar has value 2012 rather than 2013, with the end result that CHAIN_FINISH aborted.

Josh and Dermot have been informed about it. Troubleshot it on Saturday but decided not to fix it. If 2013 should have been the year used for processing, then the two SEND_MAIL components have to be looked at. I believe the evaluation done there also assumed 2012.

Gudrun.
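
A rough sketch of how the year mismatch could be confirmed before restarting: look for the ORIG.29A52*txt file under each candidate tax year and report where it actually sits. The function names and the directory layout <staging>/tax/<year>/ are taken from the path quoted above; everything else is illustrative.

```python
import glob
import os


def find_tax_files(staging_dir, years, pattern="ORIG.29A52*txt"):
    """Return {year: [matching paths]} for <staging_dir>/tax/<year>/<pattern>."""
    found = {}
    for year in years:
        search = os.path.join(staging_dir, "tax", str(year), pattern)
        found[year] = sorted(glob.glob(search))
    return found


def year_mismatch(staging_dir, configured_year, candidate_years):
    """Return the other years that hold the file when the configured year does not."""
    years = sorted(set(candidate_years) | {configured_year})
    found = find_tax_files(staging_dir, years)
    if found[configured_year]:
        return []  # file is where the subvar says it should be
    return [year for year in years if found[year]]
```

With the subvar at 2012 and the file sitting under tax/2013, year_mismatch(staging, 2012, [2012, 2013]) would report 2013.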

Here is the Parameter that is set to 2013.

Mike.

Here is the SQL that can be used to extract that data. We need to involve BFS on this change.

select txt

from krns_parm_t

where nmspc_cd = 'KUALI-TAX'

and parm_dtl_typ_cd = 'PayeeMasterExtractStep'

and parm_nm = '1099_REPORTING_PERIOD'

Josh.

Aborted Module Name: FAIDSAIG_EV.TDCLIENT_01

Date: Day: Time: Resolution:

03/16/13 Sat 00:04 Restarted by Gudrun.

08/16/13 Fri 00:05 Restarted by Joleen.

Error log and follow up comments:

03/16/13.

Checked log. Clean abort. ftp failed right from the start. Logged on to quartz to test connection to saigportal server. Simple ping as well as ftp test to server failed.

Checking again in the early morning hours plus contact FAID team ...

TDCLIENT aborted in AWPROD. Find attached the log file for the aborted TDCLIENT job. SAIGPORTAL server is not responding to ftp connection attempts when logged on to quartz as jobprd. At this point no resolution of the issue is possible. Server simply is not responding to ftp.

AWPROD FAIDSAIG_EV.TDCLIENT_01 aborted in the early morning hours due to an ftp connection failure. Since then, repeated attempts to initiate an ftp connection from our quartz server to SAIGMAILBOX.ED.GOV have failed. Resolution will have to wait until the service is operable again on their side.

Phil responded back. He will contact FAID staff depending on who can be reached. Candy is out on vacation it appears. He will call me back.

Looking at backlog and the number of aborts out there, the chance of all of them being resolved by tomorrow morning is a bit slim, so I propose to postpone SP10 patching until next Sunday, March 24. The KFS abort already needs to wait until Monday.

TDCLIENT abort got resolved. Reset of component was successful this morning.

Gudrun.
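
A small sketch of the kind of reachability test described above, done as a plain TCP connect with a timeout instead of a manual ftp session. Port 21 is the standard FTP control port; whether the SAIG service listens there is an assumption, and can_connect is an illustrative helper, not part of tdclient.

```python
import socket


def can_connect(host: str, port: int = 21, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

A False result only says the port could not be reached from this host; it does not distinguish an outage on the remote side from a local network or firewall problem.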

08/16/13.

********** Start Communications Session

Connecting to server SAIGPORTAL...

FTP connection attempt failed.

Connection timed out

Error connecting to network SAIGPORTAL

(-1) FTP connection attempt failed.

Connection timed out

Termination started...

Disconnecting...

********** End Communications Session

I know that FAIDSAIG can be tricky. I checked prompts, conditions, the abort log, and I read through the process flow documentation. I looked at notes I had taken on errors that were OK to restart. This error fell in that category. I restarted FAIDSAIG_EV.TDCLIENT_01.

FAIDSAIG_TDCLIENT_FILE_INPUT has finished running.

Joleen.

Aborted Module Name: FAIDALCT.LYNX_01

Date: Day: Time: Resolution:

03/18/13 Mon 07:07 Restarted by Dawn.

Error log and follow up comments:

The current error page you are seeing can be replaced by a custom error page by modifying the "defaultRedirect" attribute of the application's <customErrors> configuration tag to point to a custom error page URL.<br><br>

I'm out of town today. Usually you can just restart the LYNX step that failed and it'll often run OK. I don't think anything has changed with that one. If you need more help today Zach will try to help.

Candy.

Aborted Module Name: HRMSSAL0.RUNGEN_01

Date: Day: Time: Resolution:

03/19/13 Tue 10:30 See follow up below.

Error log and follow up comments:

HR_6881_HRPROC_ORA_ERR

SQLERRMC ORA-01841: (full) year must be between -4713 and +9999, and not be 0

SQL_NO 1539

TABLE_NAME PER_TIME_PERIODS

APP-PAY-06881: Error ORA-01841: (full) year must be between -4713 and +9999, and not be 0 has occurred in table PER_TIME_PERIODS at location 1539

Cause: an oracle error has occurred. The failure was reported on table PER_TIME_PERIODS at location 1539 with the error text ORA-01841: (full) year must be between -4713 and +9999, and not be 0

Action: Please contact your support representative.

I have no idea what the problem is here, the pay periods are set up through Fiscal 2014 even and encumbrance have been using them all year. Chris said Steve is out, so she is sending this on to Bob.

Vickie S.

You have probably already figured this out, but the last time this happened there was a value set tied to the conc program parameter that was date based and it only looked out 60 days. I needed to change it to 120 days. I think this value set just needs to be tweaked.

Steve H.

Aborted Module Name: FAIDALIM.SEND_MAIL_01

Date: Day: Time: Resolution:

07/08/13 Mon 07:13 Restarted by Joleen.

Error log and follow up comments:

cat: 0652-050 Cannot open /ais01/dat/work/prod/FAIDALIM_DRIVER1.DAT.

***

*** END SEARCH OF LAST JOB (10913846.01) AFTER CONDITIONS RUNHOSTLOG FOR ERROR STRINGS

The file does not exist. It was supposed to be created by a BEFORE condition on FAIDALIM.SSH_SFTP_01. RUN HOST COMMAND: {#logrunhost}; touch {#workdat}/FAIDALIM_DRIVER1.DAT; touch {#workdat}/FAIDALIM_DRIVER2.DAT.

I’m not sure how to proceed. There is a source file from SCH05FO@ftp.elmproduction.com:/mailbox/COLOSTAT/INBOX/*.DS2.

I’m not sure where the information from this source file is, or whether it needs to be contained in {#workdat}/FAIDALIM_DRIVER1.DAT.

FAIDALIM.SSH_SFTP_01 status is empty; I’m assuming I can create an empty {#workdat}/FAIDALIM_DRIVER2.DAT for that one.

Joleen.

Aborted Module Name: ODSRHRMS.ODSRS002_01

Date: Day: Time: Resolution:

04/04/13 Thu 05:44 Restarted by Robin.

Error log and follow up comments:

**** NOT_FINISHED FEEDBACK - FOLLOW-UP REQUIRED - COMPONENTS NOT FINISHED ****

Chain Name Chain Component Name Status Date/Time

----------------------------- -------------------- ------------ -----------------

ODSRHRMS_REFRESH_HRMS_ODSPROD ODSRHRMS.ODSRS002_01 CANCELLED 04/04/13 00:05:44

ODSRHRMS.ODSRS002_01 CRITFAIL 04/04/13 00:05:44

**** END OF NOT_FINISHED FEEDBACK ****

The error was:

"ORA-08103: object no longer exists - ORA-02063: preceding line from HRPROD@HR_LINK"

The OWB job must have lost its connection.

OK to restart.

Mark P.

Aborted Module Name: KFSXAPEI.KFSX_JAVA_01

Date: Day: Time: Resolution:

04/15/13 Mon 05:30 Restarted by Dermot.

Error log and follow up comments:

grep: 0652-033 Cannot open /ais02/app/kfs/prd/logs/KFSXAPEI.electronicInvoiceExtractStep.10286358.10286361.00*.

2013-04-15 05:30:35,304 [main] INFO org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: Mon Apr 15 05:30:35 MDT 2013 - Invoking localhost:1099/KFSXAPEI.electronicInvoiceExtractStep.10286358.10286361.00/electronicInvoiceExtractStep

Exception in thread "main" java.lang.NullPointerException

at org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient.main(BatchJobRmiInvokerClient.java:83)

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.44#> [[ 1 > 0 ]]

<#errtrap_ssh.44#> exit 1

2013-04-12 05:33:24,255 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.rice.kns.service.impl.PostProcessorServiceImpl :: finished handling route status change from I to R for document 2357872

2013-04-12 05:33:24,261 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.rice.kew.routeheader.service.impl.WorkflowDocumentServiceImpl :: routeDocument: org.kuali.rice.kew.routeheader.DocumentRouteHeaderValue@3a5c3a5c[

routeHeaderId=2357872

documentTypeId=320823

docVersion=1

docTitle=Electronic Invoice Reject Document - PO: 373878 Vendor: OfficeMax Inc

2013-04-12 05:32:30,170 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Reject document has been created (DocNo=2357872)

2013-04-12 05:32:30,171 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: 606788404omax1_269486APR1113_22752585499849576.xml has been rejected

2013-04-12 05:32:30,174 [RMI TCP Connection(2)-129.82.127.238] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Processing 606788404omax1_271929APR1113_22752587318510904.xml....

2013-04-12 05:32:30,174 [RMI TCP Connection(2)-129.82.127.238] INFO

Cannot find the rejected :: 606788404omax1_269486APR1113_22752585499849576.xml file!

We also had an ABORT on KFSXCS16.CHAIN_SQL_INIT_01 around the same time with ORA-12541: TNS:no listener error.

I checked for .processed files; there were none. Josh requested that I restart the ABORTED job; I did & it completed successfully.

Dermot.

Aborted Module Name: FAIDPMAN_OD.SEND_MAIL_RW_01

Date: Day: Time: Resolution:

04/22/13 Mon 21:06 Restarted by Joleen.

Error log and follow up comments:

SMTP recipient() command failed:

5.1.1 <cindy.heckle@colostate.edu>... User unknown

error is 255

This one was hard to figure out. When I tried just removing cindy.heckle@colostate.edu from the SEND_MAIL_RW_01 component "Recipient" prompt in Backlog, I got a database query error of "value too large for column". What finally worked was to delete the entire component prompt value in Backlog, save that change, and then go back in and repopulate the prompt minus cindy.heckle@colostate.edu , and restart the component, which finished successfully. Then the same failure occurred with the SEND_MAIL_RW_02 component, and I used the same process to get it to finish successfully. Not sure what caused this in the first place -- cindy.heckle@colostate.edu appears to be a valid email address from what I can see. It looks to me like the email recipient list is derived from file /userfiles/Ufaid/data/FAIDPMAN_OD.RWCLIENT-EMAIL_01.DAT

Steve.
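
A minimal sketch of rebuilding a recipient list without one address, which is what the delete-and-repopulate workaround amounted to. The comma separator and the example addresses are assumptions; the log does not show how the "Recipient" prompt or the .DAT file delimits addresses.

```python
def remove_recipient(recipients: str, address: str, sep: str = ",") -> str:
    """Return the recipient list with every occurrence of address removed.

    Comparison ignores case and surrounding spaces; empty entries are dropped.
    """
    target = address.strip().lower()
    kept = [entry.strip() for entry in recipients.split(sep)]
    kept = [entry for entry in kept if entry and entry.lower() != target]
    return sep.join(kept)


if __name__ == "__main__":
    # Placeholder addresses, not taken from the job.
    current = "first.user@example.edu, former.user@example.edu,third.user@example.edu"
    print(remove_recipient(current, "former.user@example.edu"))
```

The rebuilt string is also shorter than the original, which may matter if the "value too large for column" error was about the length of the stored prompt value.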

I'm sorry about this. Cindy Heckle no longer works for us. I thought she was still on campus, but maybe she isn't. I'll get her email address out of all our lists and jobs.

Candy.

Aborted Module Name: FAIDSUNT.LYNX_01

Date: Day: Time: Resolution:

04/26/13 Fri 22:01 Restarted by Joleen.

06/15/13 Sat 04:31 See follow up note below.

Error log and follow up comments:

04/26/13.

URL=https://wsnet.colostate.edu/cwis231/autorun/JobChain/SummerAwardEmail.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

Standard output file:

[SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)]

Line 34: OR Hold IS NULL

Line 35: )", new SqlConnection(connectionBanner));

<font color=red>Line 36: command.Connection.Open();

</font>Line 37: SqlDataReader data = command.ExecuteReader();

Line 38: StringBuilder result = new StringBuilder();</pre></code>

[[Win32Exception]: The network path was not found

[SqlException]: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)

It aborted again. It is OK on our side if you want to run it manually. I can delete the LYNX module and let the rest of the job finish running. Do you need to run that manually before I let the job finish?

Joleen.

I don’t think we want to delete it, but just comment it out for now and see if it will run tomorrow. Yes I would need to run my page before the job runs.

Zach.

That is a great plan. Let me know when you are done running the page and I will comment out the LYNX and let the rest of the job finish.

Joleen.

Okay Joleen, I ran the page manually and when I did I found a different error. I fixed this error and the page executed normally, so I think it will be okay the next time the job auto runs. You can continue running the job now.

Zach.

06/15/13.

Error: StartIndex cannot be less than zero.

There are no summer award emails to send, so that caused the error. Can you force the job to finish without running the LYNX? Zach will fix it so it knows what to do if the query is empty the next time. Sorry to cause you more work!

Candy.

I bypassed the LYNX. FAIDSUNT_SUMMER_NOTIFICATIONS has finished running.

Joleen.

Aborted Module Name: FAIDSUMR.LYNX_01

Date: Day: Time: Resolution:

05/02/13 Thu 22:21 See follow up below.

09/23/13 Mon 21:16 Restarted by Joleen.

04/16/14 Wed 21:16 Deleted by Joleen.

Error log and follow up comments:

Oracle.DataAccess.Client.OracleCommand.ExecuteReader(Boolean requery, Boolean fillRequest, CommandBehavior behavior) +4168

Oracle.DataAccess.Client.OracleCommand.ExecuteReader() +136

Summer.getBaseCriteria(String pidms, String aidYearCode, String termFall, String termSpring, String termSummer) in e:\USERS\cwis231\wwwroot\autorun\JobChain\Summer.aspx.cs:31

Summer.Page_Load(Object sender, EventArgs e) in e:\USERS\cwis231\wwwroot\autorun\JobChain\Summer.aspx.cs:20

System.Web.UI.Control.LoadRecursive() +71

System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsync

This is okay. I will look at it later to fix the issue, but it's basically saying that there are no records to process. We can ignore this and mark the job as complete.

Zach.

FAIDSUMR_SUMMER_PROCESSING has finished running.

Joleen.

09/23/13.

[OracleException (0x80004005): ORA-12571: TNS:packet writer failure]

Can we try re running these first? This looks like a network hiccup.

Zach.

FAIDSUMR.LYNX_01 was restarted and finished successfully -- I then restarted FAIDTRAK_OD.LYNX_02 and it also finished successfully.

Joleen.

04/16/14.

Object reference not set to an instance of an object.

Zach asked me to delete FAIDSUMR_SUMMER_PROCESSING from tracking for today. It will be back on the schedule tonight.

Joleen.

Look in the chain prompts & copy the standard output file name:

/appworx/out/LYNX_12998547.00.stdout.txt

Change the .txt to .html & send to Support!

Dermot.

Aborted Module Name: AREGTTRN.AREGS620_01

Date: Day: Time: Resolution:

05/03/13 Fri 08:23 See follow up below.

Error log and follow up comments:

ORA-01843: not a valid month

ORA-06512: at line 130

08:23:36 127 --******************************************************************************

08:23:36 128 --Set the request date to the TOD file order date

08:23:36 129 --******************************************************************************

08:23:36 130 vrequest_date := to_date(n_rec.order_date,'YYYYMMDD');

08:23:36 132 --******************************************************************************

Just took a quick look at AREGS602 which read from SWLTTOD table. The order_date column does look odd

42013050

52013050

32013050

92013050

But I think the file in general is 1 character off. I think the input must be 1 character off at or before Order Number. Note that Status starts with a 3, which should be the last digit of Order Date (2013/05/03).

Order Number	OrderDate	Status	Status Update	When to Deliver
211508	42013050	3Ready For Processing	2013050309192	6now
211508	52013050	3Ready For Processing	2013050309203	0now
211510	32013050	3Ready For Processing	2013050309255	4now
211517	92013050	3Ready For Processing	2013050309482	3now
211518	92013050	3Ready For Processing	2013050309533	0now
211528	92013050	3Ready For Processing	2013050310145	4now

Not sure where to go from here.

Vicki.
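
A short sketch of the check implied above: validate the order date as YYYYMMDD and, when it fails, test whether undoing a one-character shift produces a valid date. The field values come from the rows shown; the function names are illustrative and this is not how AREGS620 itself parses the TOD file.

```python
from datetime import date


def parse_yyyymmdd(text: str) -> date:
    """Parse an 8-digit YYYYMMDD string, raising ValueError if it is not a real date."""
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"not an 8-digit date: {text!r}")
    return date(int(text[:4]), int(text[4:6]), int(text[6:]))


def is_valid_date(text: str) -> bool:
    try:
        parse_yyyymmdd(text)
        return True
    except ValueError:
        return False


def realign(order_date: str, status: str):
    """Undo a one-character right shift across two adjacent fields.

    The stray leading character of order_date is dropped and the first
    character of status is moved back to the end of order_date.
    """
    return order_date[1:] + status[:1], status[1:]


if __name__ == "__main__":
    raw_date, raw_status = "42013050", "3Ready For Processing"
    fixed_date, fixed_status = realign(raw_date, raw_status)
    print(is_valid_date(raw_date), fixed_date, is_valid_date(fixed_date), fixed_status)
```

If every rejected row becomes valid after the same one-character realignment, that points at an extra character earlier in the record and not at bad dates in the source data.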

We’re seeing the same thing. It looks like TOD has inserted one extra padding character between the Authentication and Order Number fields in the downloaded text files. I’ll work with Matt on contacting Scrip Safe to get this resolved.

Rob.

Per Rob I deleted the failed AREGFQTR process flow and then re-started from the beginning in order to pick up the corrected files from Escrip-safe. It looks like it is working now and AREGS620 is complete.

David.

It looks like things are working correctly again. The data has been corrected and I can see the failed transcript requests in our tables.

Matt – when you get a chance you may want to check TOD’s portal to make sure we don’t have any missing orders.

Rob.

Aborted Module Name: KFSXCS53.KFSXS055_01

Date: Day: Time: Resolution:

05/03/13 Fri 22:23 See follow up below.

Error log and follow up comments:

old 3: utlpath varchar2(255) := '&utl_path';

new 3: utlpath varchar2(255) := '/orautl/kfsprd';

old 4: outfile1 varchar2(80) := '&utl_file1';

new 4: outfile1 varchar2(80) := 'KFSXCS53.KFSXS055_01.utl_file1';

,krns_nte_t t2

ERROR at line 26:

ORA-06550: line 26, column 8:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 16, column 1:

ORA-06550: line 39, column 11:

PL/SQL: ORA-00942: table or view does not exist

Please make a note that the synonyms have been created for KRNS_NTE_T and KRNS_ATT_T.

This should never impact us, but I want to make sure that we have it noted in the chain.

The only time this would impact us is if we start running batches for Rice, KC or KPME attachments.

Josh.

I had to create three - KRNS_ATT_T, KRNS_NTE_T, and KRNS_DOC_HDR_T. In the consolidation we didn't create synonyms for objects that exist in both KFS and KR. Our approach was to deal with issues on a case-by-case basis. In this case I created the synonyms against the KFSUSER tables vs. the KRUSER tables.

Shawn.

That is correct, we were planning on these issues coming up. Creating the synonyms was our plan. I would still like Dermot to document this, if another chain or job is created the default pointer is going to be the KFS owned table. Just want to make sure that stays on everyone's mind.

Josh.

Aborted Module Name: FAIDLORC_OD.LYNX_03

Date: Day: Time: Resolution:

05/06/13 Mon 15:20 See follow up below.

Error log and follow up comments:

Standard output:

Disbursement Amount Discrepency

</title></head>

<body>

Production page that runs in FAIDLORC.

Job rprlorc got recently tested and Banner agent converted by David and Joleen.

RPRLORC output looks ok to me. The Banner prompt values passed in were picked up properly according to Summary in .lis .

It seems they have a processing issue at their end.

Gudrun.

I restarted FAIDLORC_OD.LYNX_03 and it worked this time. Yippee!

Joleen.

Aborted Module Name: ODSRAGEN.ODSRS001_01

Date: Day: Time: Resolution:

05/07/13 Tue 00:04 Restarted by Dermot.

Error log and follow up comments:

old 6: csug_ods_refresh.log_begin_time('&REFRESH_APP');

new 6: csug_ods_refresh.log_begin_time('REFRESH_STUDENT');

old 8: ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

new 8: ia_admin.mgkmap.P_RunETLMapSlots('3', job, 'REFRESH_STUDENT', NULL, '');

old 14: csug_ods_refresh.log_end_time('&REFRESH_APP');

new 14: csug_ods_refresh.log_end_time('REFRESH_STUDENT');

begin

ORA-20001: ODS Refresh Failed

ORA-06512: at line 12

00:04:46 5 begin

00:04:46 6 csug_ods_refresh.log_begin_time('&REFRESH_APP');

00:04:46 7 select sys.jobseq.nextval into job from dual;

00:04:46 8 ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

00:04:46 9 select mdblogh_error_ind into error_ind

00:04:46 10 from ia_admin.mdblogh where mdblogh_jobno=job;

00:04:46 11 if (error_ind <> 'N') then

00:04:46 12 raise_application_error(-20001,'ODS Refresh Failed');

00:04:46 13 end if;

00:04:46 14 csug_ods_refresh.log_end_time('&REFRESH_APP');

00:04:46 15 end;

The error was too generic for me to understand what the issue might be so I tried phoning Mark P for more direction. He was unavailable so I phoned Mark B. Mark advised to check OWB.

I logged into the ODS IA Admin tool. I identified an issue in the UPDATE_MST_ADMISSIONS_REQUIRE mapping. A "value too large" error occurred in the MST_Admissions_Requirement TABLE on the Requirement_Comment column. All the other STUDENT mappings appeared to have run successfully.

I phoned Dermot and told him this was a data issue and would have to wait until morning. Immediately afterward, Mark B. phoned back. He had checked the Ellucian support center and there was a workaround for this issue. He sent me an email with SQL to change the size of the Requirement_Comment column to 4000 characters. I did this, phoned Dermot back and asked him to rerun the job, which he is doing.

Thank you Mark Britton for the extra effort above and beyond the call of duty and thank you Dermot for your patience in the wee hours of the morning.

Shawn.

Here is the error:

ORA-12899: value too large for column "ODSMGR"."MST_ADMISSIONS_REQUIREMENT"."REQUIREMENT_COMMENT" (actual: 38, maximum: 30)

I will need to fix the mapping and the underlying table before we can re-start this.

On call news states that Shawn and Mark B. figured this out last night.

Mark P.

Aborted Module Name: AREGDYIR.AREGS800_01

Date: Day: Time: Resolution:

05/07/13 Tue 00:04 Restarted by Joleen.

Error log and follow up comments:

00:04:56 681 /

old 459: lv_log_path := '&utl_path';

new 459: lv_log_path := '/orautl/BANPROD';

old 460: lv_log_file := '&utl_file1';

new 460: lv_log_file := 'AREGDYIR.AREGS800_01.utl_file1';

DECLARE

ERROR at line 1:

ORA-01400: cannot insert NULL into

("SATURN"."SIBINST"."SIBINST_OVERRIDE_PROCESS_IND")

ORA-06512: at line 147

ORA-06512: at line 272

ORA-06512: at line 516

00:04:56 268 IF cur_existing_sibinst%NOTFOUND THEN

00:04:56 269 /*******************************/

00:04:56 270 /* New SIBINST record - INSERT */

00:04:56 271 /*******************************/

00:04:56 272 pInsert_SIBINST;

00:04:56 273 lv_inst_inserts := lv_inst_inserts + sql%rowcount;

00:04:56 274 ELSE -- Cur_Existing_SIBINST%FOUND

Just a little bit on what I found…

I checked the 8.5.1 student release guide and there are three new columns in the SIBINST table (the sibinst_override_process_ind is required). It looks like AREGS800 does an insert into SIBINST, but doesn’t have this new required column in the insert. All existing rows appear to be defaulted to ‘N’.

Is anyone familiar with this process?

Rob.

It appears that 3 new columns have been added to SIBINST - Faculty Member Base Table.

AREGS800 inserts into SIBINST and the new column SIBINST_OVERRIDE_PROCESS_IND cannot be null.

It appears that the only value in SIBINST_OVERRIDE_PROCESS_IND currently is an ‘N’.

We will need to modify AREGS800 to insert an ‘N’ into SIBINST_OVERRIDE_PROCESS_IND.

SIBINST_OVERRIDE_PROCESS_IND NOT NULL VARCHAR2(1 CHAR)

SIBINST_OVERRIDE_PROC_USERID VARCHAR2(30 CHAR)

SIBINST_OVERRIDE_PROC_DATE DATE

Vicki.
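
A stand-in illustration of the failure mode, using an in-memory SQLite table in place of the Oracle SIBINST table. The two-column table is a simplification invented for the example; only the idea (an insert that omits a new NOT NULL column fails, and supplying 'N' succeeds) mirrors the notes above.

```python
import sqlite3


def demo():
    """Show an insert failing on a NOT NULL column and succeeding once 'N' is supplied."""
    con = sqlite3.connect(":memory:")
    # Simplified stand-in: the real SIBINST table has many more columns.
    con.execute(
        "CREATE TABLE sibinst ("
        " pidm INTEGER NOT NULL,"
        " override_process_ind TEXT NOT NULL)"
    )
    try:
        # Old-style insert that does not know about the new column.
        con.execute("INSERT INTO sibinst (pidm) VALUES (1)")
        old_insert_ok = True
    except sqlite3.IntegrityError:
        old_insert_ok = False
    # Modified insert that supplies 'N' for the new required column.
    con.execute("INSERT INTO sibinst (pidm, override_process_ind) VALUES (1, 'N')")
    rows = con.execute("SELECT pidm, override_process_ind FROM sibinst").fetchall()
    con.close()
    return old_insert_ok, rows
```

The alternative of giving the column a default at the table level is a vendor-schema change and is outside what the notes propose.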

Aborted Module Name: AROSDGLI.AROSS165_01

Date: Day: Time: Resolution:

05/10/13 Fri 22:08 Restarted by Joleen.

Error log and follow up comments:

ORA-12899: value too large for column

ORA-06512: at line 29

ORA-12899: value too large for column

"CSUBAN"."GURAPAY_BACKUP"."GURAPAY_STREET_LINE1" (actual: 39, maximum: 30)

ORA-06512: at line 29

22:08:34 28 begin

22:08:34 29 INSERT INTO gurapay_backup

22:08:34 30 (gurapay_system_id,

22:08:34 31 gurapay_system_time_stamp,

22:08:34 32 gurapay_doc_code,

22:08:34 33 gurapay_user_id,

22:08:34 34 gurapay_pidm,

22:08:34 35 gurapay_id,

22:08:34 36 gurapay_tran_number,

22:08:34 37 gurapay_detail_code,

22:08:34 38 gurapay_desc,

22:08:34 39 gurapay_term_code,

22:08:34 40 gurapay_account,

22:08:34 41 gurapay_dr_cr_ind,

22:08:34 42 gurapay_srce_code,

22:08:34 43 gurapay_last_name,

22:08:34 104 ,GURAPAY_ACTIVITY_DATE

22:08:34 105 ,SYSDATE

22:08:34 106 ,v_process_date

22:08:34 107 FROM gurapay);

22:08:34 108 vin_count := sql%rowcount;

22:08:34 109 end;

The problem record is for gurapay_pidm 11360346 and the gurapay_street_line1 value is "Flat 604 Block 1 No 165 Hepingli Estate", which is 39 characters. Our gurapay_backup table will need to be redefined to accept larger values, but for now can someone truncate the value to 30 characters to get our schedule restarted?

Steven Dove.

Kathy has truncated the value. I restarted AROSDGLI.AROSS165_01 and it has finished running.

Joleen.
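
A small sketch of the pre-check and the stopgap described above: list the values that would not fit a 30-character column and cut one down to size. It counts characters, which is an assumption about how the column length is defined; the helper names are illustrative.

```python
def oversized(values, max_len=30):
    """Return the values whose length exceeds max_len, in their original order."""
    return [value for value in values if len(value) > max_len]


def truncate(value: str, max_len: int = 30) -> str:
    """Cut value down to at most max_len characters (the stopgap, not the real fix)."""
    return value[:max_len]


if __name__ == "__main__":
    street = "Flat 604 Block 1 No 165 Hepingli Estate"
    print(len(street), oversized([street, "123 Main St"]), truncate(street))
```

Truncation loses data, so it only unblocks the schedule; widening the backup table remains the lasting fix mentioned in the request.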

Aborted Module Name: KFSXAPEI.KFSX_JAVA_01

Date: Day: Time: Resolution:

05/17/13 Fri 05:33 Restarted by Dermot.

05/12/14 Mon 05:30 Restarted by Dermot.

Error log and follow up comments:

04/01/14.

Started processing step electronicInvoiceExtractStep of job KFSXAPEI.electronicInvoiceExtractStep.10537434.10537437.00 for user kr

Executing step: electronicInvoiceExtractStep

#### Log file name for this job step : /ais02/app/kfs/prd/logs/KFSXAPEI.electronicInvoiceExtractStep.10537434.10537437.00-20130517-05-30-04-337.log

*******************************************************

2013-05-17 05:33:03,467 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.kuali.rice.kns.exception.ValidationException: business rule evaluation failed

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.35#> [[ 1 > 0 ]]

<#errtrap_ssh.38#> exit 1

cd /ais02/app/kfs/prd/logs

ls -ltr KFSXAPEI*

2013-05-17 05:33:03,016 [RMI TCP Connection(2)-129.82.111.82] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Saving Invoice Reject for DUNS '150982189'

2013-05-17 05:33:03,019 [RMI TCP Connection(2)-129.82.111.82] INFO org.kuali.rice.kns.document.DocumentBase :: invoking rules engine on document 2412223

cd /ais02/app/kfs/prd/work/staging/purap/electronicInvoice

2013-05-17 07:50:49,494 [RMI TCP Connection(16)-129.82.111.82] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Reject document has been created (DocNo=2412331)

2013-05-17 07:50:49,498 [RMI TCP Connection(16)-129.82.111.82] INFO org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: 150982189_8053988027_25775814900711560.xml has been rejected

I found a white space within “150982189_8053988027_25775814900711560.xml”, which I edited. I then removed the “processed” xml files and restarted the failed job.

Dermot.

05/12/14.

grep: 0652-033 Cannot open /ais02/app/kfs/prd/logs/KFSXAPEI.electronicInvoiceExtractStep.13211330.13211335.00*.

There was no output file and the Java step ran for just 5 seconds. I checked for processed files in the directory /ais02/app/kfs/prd/work/staging/purap/electronicInvoice with ls -ltr *processed; there were none, so the job evidently never got going. I restarted the step and it completed successfully.

Dermot.
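
The check for processed files described above can be sketched as a small shell function. This is an illustration only: has_processed_files is a hypothetical name, and it assumes that the staging files of interest have names ending in "processed", as the ls pattern in the note suggests.

```sh
# Hypothetical helper: succeeds if any file name in directory $1 ends in "processed".
has_processed_files() {
    dir=$1
    for f in "$dir"/*processed; do
        # When nothing matches, the pattern itself is left in $f, so confirm it exists.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}
```

A call such as has_processed_files /ais02/app/kfs/prd/work/staging/purap/electronicInvoice returning non-zero would correspond to the "there were none" case in the note.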

Aborted Module Name: AROSDBIO.AROSS141_01

Date: Day: Time: Resolution:

05/21/13 Tue 18:05 Restarted by Joleen.

Error log and follow up comments:

18:05:35 147 END; --main block.

18:05:35 148 /

old 4: lv_directory VARCHAR2(30) := '&&utl_path';

new 4: lv_directory VARCHAR2(30) := '/orautl/BANPROD';

old 5: lv_logfile VARCHAR2(30) := '&&utl_file1';

new 5: lv_logfile VARCHAR2(30) := 'AROSDBIO.AROSS141_01.utl_file1';

**** Start of AROSS141 **** 05/21/2013 18:05:35

DECLARE

ERROR at line 1:

ORA-20100: AROSS141 Failure: -20100 ORA-20100: Employee and Associate address

updates must be made in the HR system.

ORA-06512: at line 104

Our AROSDBIO flow aborted last night. Can you tell me what the error is?

Steven Dove.

A little bit more info from /orautl/BANPROD/AROSDBIO.AROSS141_01.utl_file1

Insert failed: 829912301 Moore -20100 ORA-20100: Employee and Associate address updates must be made in the HR system.

Steve G.

I’ve found the problem data, but I don’t see a way to delete it through any form. Can someone delete it from the database and then we can restart our schedule? Janet is going to check with the end user to see how they were able to create an address update record for an employee.

-- This should be 1 row

delete from twrcust where twrcust_id = '823493035';

Steven Dove.

Mark P has removed the CSUS_TERM_INFO_CUR, SPR, SMR, FAL from ODSRS002, so we are ready to restart this chain.

Vicki.

Aborted Module Name: ODSRAGEN.ODSRS001_01

Date: Day: Time: Resolution:

05/22/13 Wed 03:24 Restarted by Robin.

Error log and follow up comments:

00:04:49 22 --* THE FOLLOWING 2 LINES ARE REQUIRED

00:04:49 23 --* . -- THIS ENDS THE INPUT MODE FOR A PL/SQL BLOCK IN SQLPLUS

00:04:49 24 --* / -- THIS EXECUTES THE PL/SQL BLOCK STORED IN THE BUFFER

00:04:49 26 .

00:04:49 SQL> /

old 6: csug_ods_refresh.log_begin_time('&REFRESH_APP');

new 6: csug_ods_refresh.log_begin_time('REFRESH_STUDENT');

old 8: ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

new 8: ia_admin.mgkmap.P_RunETLMapSlots('3', job, 'REFRESH_STUDENT', NULL, '');

old 14: csug_ods_refresh.log_end_time('&REFRESH_APP');

new 14: csug_ods_refresh.log_end_time('REFRESH_STUDENT');

The ODS job appears to have failed due to a data error.

The following mapping experienced a unique constraint error on MST_GENSTU_END_TERM_INDEX_01. I will advise scheduling it is ok to let the error go until morning.

UPDATE_MST_GENRL_STDNT_STEP_1

Shawn.

The index on MST_GENSTU_END_TERM is PERSON_UID, ACADEMIC_PERIOD_START (I believe)

If I look at MST_GENSTU_END_TERM and AS_GENSTU_END_TERM for more than 1 occurrence of PERSON_UID, ACADEMIC_PERIOD_START, I do not see it.

select PERSON_UID, ACADEMIC_PERIOD_START from as_genstu_end_term group by PERSON_UID, ACADEMIC_PERIOD_START having count(*) > 1;

This is getting very serious - no CSUS_TERM_INFO data or updated data and no updates to CSUS_SECTION_INFO data!

We are going to have to figure what to say to campus and get something out on the ODS List Serv this morning

Vicki.

Aborted Module Name: AREGFQTR.SSH_SFTP_01

Date: Day: Time: Resolution:

05/25/13 Sat 12:21 Restarted by David.

Error log and follow up comments:

# COMMAND : /usr/bin/sftp -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109" colora-88@iwantmytranscript.com

# > sftp> pwd

# > Remote working directory: /home/colora-88

# > sftp> lpwd

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l /ais01/bkp/AREGTTRN.AREGS621_01.10601658.XML

# > -rw-rw---- 1 appworx Gprd 291 May 25 12:21 /ais01/bkp/AREGTTRN.AREGS621_01.10601658.XML

# > sftp> -ls -l /home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml" not found

# > sftp> put /ais01/bkp/AREGTTRN.AREGS621_01.10601658.XML /home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml

# > Uploading /ais01/bkp/AREGTTRN.AREGS621_01.10601658.XML to /home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml

# > Couldn't write to remote file "/home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml": Failure

# > (1)

This chain sends Electronic Transcripts to TOD. I am not familiar with it and had hoped to see Phil or Rob reply. I am also including Matt in this conversation because he may have a suggestion as to what to do.

Vicki.

I couldn't tell from that output if it means we could not connect to their site, or if there is an internal issue here. I have sent an email to scrip-safe to see if they could check on things on their side of this and see if they find anything not working/setup correctly.

I see that their ordering site is back up. I also just got an email from them asking us to try again, as all their other schools are connecting fine as far as connecting to pull orders and return status files.

Can we retry the job/chain/process?

Matt.

AREGFQTR.SSH_SFTP_01 has been reset and completed successfully. Transcripts are running again.

David.

Looks like about 94 transcripts got processed and had emails sent out from eScrip-Safe at about 4:30 PM our time...

Matt.

Aborted Module Name: FAIDDLDR_EV.RPRDU14_01

Date: Day: Time: Resolution:

05/30/13 Thu 12:52 Deleted by David.

Error log and follow up comments:

%Error% - Invalid or previously processed file (/ais02/dat/finaid/mpninaop.dat)

Processing MPN Due to Expire Report Acknowledgements...

%Error% - Invalid or previously processed file (/ais02/dat/finaid/mpnexpop.dat)

I noticed that the mpndisop.dat and mpninaop.dat message classes are in both the OD and EV FAIDSAIG driver files. Could this be the problem with FAIDDLDR_EV.RPRDU14_01?

David.

We normally have the message classes in both schedules of FAIDSAIG since they’re not aid year specific. The messages listed below are standard errors that come out of the RPRDUXX process. Here’s similar output from the FAIDDLDR_OD from 05/25/13:

Since the same error messages appear on almost every run of FAIDDLDR_OD, could there be another reason the EV schedule aborted? (It was the first run of the year and this is a job we “test” in production.)

Karma.

I see that RERIM-LOOP_01 ran about 70 iterations.

I’m not seeing the reason for the RPRDU14 failure. I’ll keep looking. You are saying that the errors reported are typical, correct?

David.

Based on what I see the errors listed for the mpndisop.dat, mpninaop.dat and mpnexpop.dat files are consistent with what we see in the OD schedule that’s running with RPRDU13.

There is a new exit counseling file, AHSLDEOP, that’s being brought in with RPRDU14. Is it possible there’s a problem with that file? I looked at the FAIDSAIG_EV output and it didn’t look like we picked any files up but I thought I’d throw it out there.

Karma.

We discovered an appman output scan that was looking for the text ‘error’ in the output log file. This is why RPRDU14 aborted. Since RPRDU14 actually did run successfully, I deleted the component so the Process flow could continue. I have removed this output scan from RPRDU14. If we want to scan for any specific errors in the RPRDU14 we would need to create a new output scan specific to it. I noticed that neither RPRDU13 nor RPRDU12 had output scans so I’m not sure why it was added to RPRDU14.

David.
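
To illustrate the difference between scanning for any occurrence of ‘error’ and scanning for specific errors, here is a minimal shell sketch. It is not Applications Manager output scan syntax; scan_for_unexpected_errors is a hypothetical function, and treating the "Invalid or previously processed file" message as the only routine one is an assumption based on the discussion above.

```sh
# Hypothetical stand-in for an output scan; this is NOT AppMan configuration.
# Succeeds (exit 0) only if the log in $1 has a line containing "error"
# (any case) other than the routine RPRDUxx message about invalid or
# previously processed files.
scan_for_unexpected_errors() {
    grep -i 'error' "$1" | grep -v 'Invalid or previously processed file' >/dev/null
}
```

With a filter like this, a log holding only the routine %Error% lines would not trigger an abort, while any other line mentioning an error still would.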

Aborted Module Name: HRMSENCD.HRMSS079_01

Date: Day: Time: Resolution:

06/05/13 Wed 18:01 Restarted by Dermot.

10/18/13 Fri 18:01 Restarted by David.

Error log and follow up comments:

/ais01/dat/work/prod/HRMSENCD.HRMSS079_01.10680315.10680318.00.2013_06_05_1801_sql_followup

+ cat /ais01/dat/work/prod/HRMSENCD.HRMSS079_01.10680315.10680318.00.2013_06_05_1801_sql_followup

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

06/05/2013 19:03 DBARRETT

Received page for critical job failure in HRMSENCD chain.

It's the error we sometimes see in HRMSS079 with resource busy.

I tried to resubmit HRMSS079, but it failed again with same message.

I called Mark B. (on-call DBA) to check the HR database.

Mark called back & had me restart the failed job & it completed successfully.

Thanks for the information. The next time this happens we will get the session information and dig a bit deeper to try and identify the underlying problem.

Steve H.

10/18/13.

Dawn received a page for a critical job failure in the HRMSENCD chain and notified me.

It's the error we sometimes see in HRMSS079 with resource busy. I tried to resubmit HRMSS079, but it failed again with the same message. I called Mark B. (on-call DBA) to check the HR database.

Mark had me re-start several times and we finally got lucky. It completed successfully.
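
The restarts above were done by hand in AppMan. Purely as an illustration of retrying a step that hits a transient lock such as ORA-00054, a retry-with-pause loop could look like the sketch below. retry_with_delay and its arguments are hypothetical and are not part of the HRMSENCD chain.

```sh
# Hypothetical wrapper: run a command up to $1 times, pausing $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry_with_delay() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

A pause between attempts gives the session holding the lock time to finish; it does not replace finding out which session is holding it.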

Aborted Module Name: KFSXAPAP.KFSX_JAVA_01

Date: Day: Time: Resolution:

06/11/13 Tue 19:36 See notes below.

Error log and follow up comments:

2013-06-11 19:35:54,717 [main] INFO org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient ::

*******************************************************

Started processing step autoApprovePaymentRequestsStep of job KFSXAPAP.autoApprovePaymentRequestsStep.10724485.10724487.00 for user kr

Executing step: autoApprovePaymentRequestsStep

#### Log file name for this job step : /ais02/app/kfs/prd/logs/KFSXAPAP.autoApprovePaymentRequestsStep.10724485.10724487.00-20130611-19-35-47-749.log

*******************************************************

2013-06-11 19:35:54,719 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.kuali.rice.kns.exception.ValidationException: business rule evaluation failed

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

2013-06-11 19:35:53,159 [RMI TCP Connection(32)-129.82.111.82] INFO org.kuali.kfs.module.purap.document.service.impl.PaymentRequestServiceImpl :: -- Initial filtering complete, returned 24 docs.

2013-06-11 19:35:53,862 [RMI TCP Connection(32)-129.82.111.82] INFO org.kuali.rice.kns.document.DocumentBase :: invoking rules engine on document 2299826

2013-06-11 19:35:53,867 [RMI TCP Connection(32)-129.82.111.82] INFO org.kuali.kfs.module.purap.document.PurchasingAccountsPayableDocumentBase :: Checking persisted source accounting lines for read-only fields

2013-06-11 19:35:53,878 [RMI TCP Connection(32)-129.82.111.82] INFO org.kuali.kfs.module.purap.document.PurchasingAccountsPayableDocumentBase :: Checking source accounting lines for read-only fields

2013-06-11 19:35:54,115 [RMI TCP Connection(32)-129.82.111.82] ERROR org.kuali.kfs.module.purap.document.validation.impl.PaymentRequestReviewValidation :: validatePaymentRequestReview() Payment Request 254230, Item 1 has quantity '1.00' but outstanding encumbered quantity 0.00

KFSXAPAP.KFSX_JAVA_01 / KFSXAPAP_PURAP_APPROVE_PYMTS aborted last night. The job cancelled itself and did not hold up the schedule. Document 2299826 was reported to Swaro, and it ran successfully the next evening.

Dermot.

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01

Date: Day: Time: Resolution:

06/13/13 Thu 14:23 Removed illegal characters, Shawn updated table, job restarted.

Error log and follow up comments:

2013-06-13 14:23:18,570 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 92, maximum: 90)

; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 92, maximum: 90)

This is the output from the sql I ran (see sql script below).

1784540 char = 49840 ° Position:83

1980907 char = 49793 Position:71

1980907 char = 49793 Position:79

1980907 char = 49793 Position:104

1980907 char = 49793 Position:108

2149640 char = 50102 ö Position:4

2271266 char = 49810 ? Position:39

2321383 char = 50076 Ü Position:4

2427751 char = 50051 Ã Position:70

2429665 char = 50051 Ã Position:70

2449892 char = 50051 Ã Position:78

declare

bad varchar2(20);

vchar varchar2(20);

loop_size number;

cursor s1 is

select fdoc_nbr,

dv_chk_stub_txt,

to_number(length(dv_chk_stub_txt)) sz

from fp_dv_doc_t

where dv_chk_stub_txt is not null;

begin

for x in s1 loop

loop_size := to_number(x.sz) + 5;

for i in 1..loop_size loop

bad := ascii(substr(x.dv_chk_stub_txt,i,1));

vchar := substr(x.dv_chk_stub_txt,i,1);

if to_number(bad) > 255 then

dbms_output.put_line(x.fdoc_nbr|| ' char = '|| bad ||' ' || vchar || ' Position:' || i );

end if;

end loop; -- end of per-character loop

end loop; -- end of cursor loop

end;

Dermot.
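
A possible reading of the numbers in the SQL output above, assuming the database character set is UTF-8 (the log does not say): Oracle's ASCII function returns the bytes of a multi-byte character as a single decimal number, so 49840 corresponds to the two bytes C2 B0, which is the UTF-8 encoding of the degree sign. Each such character takes two bytes, which would be consistent with a note that looks like 90 characters but is rejected as 92 by a column limited to 90 bytes. The helper below is hypothetical and only does the arithmetic.

```sh
# Hypothetical helper: split a two-byte value reported by ASCII() into hex bytes.
# Assumes the value is below 65536, as all values in the output above are.
bytes_of() {
    printf '%02X %02X\n' "$(( $1 / 256 ))" "$(( $1 % 256 ))"
}

bytes_of 49840   # degree sign in the output above
bytes_of 50102   # o with diaeresis in the output above
```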

Aborted Module Name: FAIDLORC_EV.LYNX_01

Date: Day: Time: Resolution:

06/15/13 Sat 00:29 See note below.

09/23/13 Mon 15:22 Restarted by Joleen.

Error log and follow up comments:

Error:

ORA-12571: TNS:packet writer failure

I don’t know what’s wrong with this one, but I ran that page manually and it didn’t error. Can you start up again from where the job left off? You wouldn’t need to run the LYNX (since I just ran that page), but you could start at the next step.

Sorry, I don’t know more about that one :(

Candy.

I bypassed the aborted LYNX_01 and ran the rest of the job. FAIDLORC_DIRECT_LOAN_REC has finished running.

Joleen.

09/23/13.

wc: 0653-755 Cannot open /appworx/out/FAIDLORC_EV.LYNX_03.11510533.00.status.txt.

0 /appworx/out/LYNX_11510533.00.stdout.txt

15 /appworx/out/FAIDLORC_EV.LYNX_03.11510533.00.stderr.txt

15 total

*** /appworx/out/FAIDLORC_EV.LYNX_03.11510533.00.stderr.txt ***

Looking up wsnet.colostate.edu

Making HTTP connection to wsnet.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Retrying as HTTP0 request.

Looking up wsnet.colostate.edu

Making HTTP connection to wsnet.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Alert!: Unexpected network read error; connection aborted.

Can't Access `http://wsnet.colostate.edu/cwis231/autorun/disb_discrepency.aspx?ay=1314'

Alert!: Unable to access document.

lynx: Can't access startfile

***

[101] : *** ERROR Detected in Output : File Empty ***

We apparently had a server app pool issue, but I think you could give it a try again.

Candy.

Aborted Module Name: AROSDBIO.AROSS141_01

Date: Day: Time: Resolution:

06/19/13 Wed 18:05 Restarted by Joleen.

Error log and follow up comments:

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-20100: AROSS141 Failure: -20100 ORA-20100: Employee and Associate address

ORA-06512: at line 104

Robin.

Here is more information on the abort:

Insert failed: 822253958 Garber -20100 ORA-20100: Employee and Associate address updates must be made in the HR system.

Joleen.

I removed the problem record. Can you restart our production schedule?

Steven Dove.

Aborted Module Name: AREGRTWL_SM.SFRBWLP_01

Date: Day: Time: Resolution:

06/24/13 Mon 00:05 Deleted by Joleen, see note below.

Error log and follow up comments:

Here is the .shl file:

$JOB -f -o $H/$TEMP.lis 0<$H/$TEMP.in 1>$LOG 2>&1

progRet=$?

/bin/rm $H/$TEMP.in 1>>$LOG 2>&1

/bin/rm $H/$TEMP.shl 1>>$LOG 2>&1

exit $progRet

Joleen.
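
The .shl listing saves the job's exit status in progRet before the two rm commands run, because each later command overwrites $?. A minimal sketch of that pattern follows; run_job and run_and_clean are stand-ins invented for illustration, not part of the Banner scripts.

```sh
# Stand-in for the Banner job ($JOB in the listing); it always fails with status 3.
run_job() {
    return 3
}

# Hypothetical wrapper showing the capture-then-clean-up order.
run_and_clean() {
    tmp=$(mktemp)
    run_job
    progRet=$?        # capture at once: the next command replaces $?
    rm -f "$tmp"      # cleanup succeeds, so $? is now 0
    return "$progRet" # report the job's status, not the cleanup's
}
```

Without the saved value, the wrapper would exit with the status of the last rm and a failed job would look successful to the scheduler.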

sleepwake process flow AREGRTWL_* is being run every 30 minutes in AppMan.

For SM: 05 and 35 every hour

For FA: 09 and 39 every hour

Once sleepwake is running again and the table entry exists AppMan is fine.

At the moment job aborts of Banner job sfrbwlp continue to accumulate in backlog.

The usual email expected to be generated by AppMan AREGRTWL_* sleepwake process restart was NOT sent out last night.

Recipients REGSCHED_LIST@colostate.edu and IS DL: Alert AGEN AREG.

Example from last Sunday:

For Fall:

----------------------------------------------------------------

*** Sleep Wake process SFRBWLP was restarted for FA_SFRBWLP

*** Next Execution: 2013/06/17 00:15:23

*** System Time: 2013/06/17 00:10:56

----------------------------------------------------------------

For Summer:

----------------------------------------------------------------

*** Sleep Wake process SFRBWLP was restarted for SM_SFRBWLP

*** Next Execution: 2013/06/17 00:11:01

*** System Time: 2013/06/17 00:06:52

----------------------------------------------------------------

Gudrun.

I passed this along to Vicki and Phil...I think this might be a set up issue in Banner. I haven't heard back from them yet.

Mark B.

All AWPROD sleepwake AREGRTWL process flows with aborted sfrbwlp jobs can be deleted.

Just keep one in backlog so that a test run can be performed once DBAs have a fix.

I logged the issue with them last night, but yesterday was a long day and I am not sure how far they got.

Gudrun.

Aborted Module Name: KFSXAPPC.KFSXS074_01

Date: Day: Time: Resolution:

06/24/13 Mon 19:21 Deleted by Dermot.

Error log and follow up comments:

old 6: utlpath varchar2(255) := '&&utl_path';

new 6: utlpath varchar2(255) := '/orautl/kfsprd';

old 7: infile1 varchar2(80) := '&&utl_file1';

new 7: infile1 varchar2(80) := 'KFSXAPPC.KFSXS074_01.utl_file1';

old 8: outfile1 varchar2(80) := '&&utl_file2';

new 8: outfile1 varchar2(80) := 'KFSXAPPC.KFSXS074_01.utl_file2';

**** Start of KFSXS074 06/24/2013 19:21:15

237239 Already Closed.

ORA-06502: PL/SQL: numeric or value error: character to number conversion error

**** STATISTICS *****

Number of Records Read = 225

Number of Records written = 223

Number of Errors = 1

Successfully Complete.

This is a data issue that I will need to look into.

Go ahead and cancel the aborted job.

It looks like all but one PO will get closed.

We can manually close it in the morning.

Josh.

Aborted Module Name: FAIDRGRT_EV.RWRDCLN_01

Date: Day: Time: Resolution:

06/24/13 Mon 21:34 Restarted by Gudrun.

Error log and follow up comments:

Output directory /appworx/out

User search pattern:.*[^0-9']{ONE_UP}[^0-9].*

User search evaluated: .*[^0-9']3119750[^0-9].*

Search /appworx/out using .*[^0-9']3119750[^0-9].*

Files [/appworx/out/rwrdcln_3119750.log]

User rename pattern:FAIDRGRT_EV.RWRDCLN_01.BANNER.10819190.10819196.00.{ONE_UP}.{fileext}

User rename evaluated: FAIDRGRT_EV.RWRDCLN_01.BANNER.10819190.10819196.00.{ONE_UP}.{fileext}

Renaming file from:/appworx/out/rwrdcln_3119750.log

to:/appworx/out/FAIDRGRT_EV.RWRDCLN_01.BANNER.10819190.10819196.00.3119750.log

Cap: Error executing Banner commandError code returned:139

Command:/appworx/banner/banprod/UC4gjajobs.shl rwrdcln C jobprd pass 3119750 NOPRINT

******************************************* End of job log *******************************************

Job Aborted: : Error executing Banner commandError code returned:139

All is not quite well yet in AppMan Banner world. There are two jobs aborted in AWPROD that failed because they encountered a memory fault error during execution.

Only an empty Banner .log file is created. .lis file is missing.

Not all Banner C jobs appear to throw this error so something is different about these jobs.

SWPCOFA is run as part of FAID process flow FAIDCFA2_COF_ATTRIBUTES_2

AGENC001 is run as part of AGEN process flow AGENDYHB_HRMS_NEW_PERSON_BRDG

Could these jobs possibly be recompiled? I will restart and see if the issue remains.

+ 1>> /appworx/out/swpcofa_3119560.in

+ echo $JOB -f -o $H/$TEMP.lis 0<$H/$TEMP.in 1>$LOG 2>&1

+ 1>> /appworx/out/swpcofa_3119560.shl

+ echo progRet=$?

+ 1>> /appworx/out/swpcofa_3119560.shl

Restart of jobs agenc001 and rwrdcln was successful. They finished.

I have to check out swpcofa a bit more before resetting.

If you have compiled swpcofa like the others, I believe it should be fine.

Are there any other custom ones that need recompiling?

Memory fault error of certain Banner C jobs got fixed. Mark Britton had to recompile them. Apparently these are custom Banner C jobs and this was not taken into account when compiling them the first time.

Reset of agenc001 and rwrdcln was successful.

Gudrun.
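
The error code 139 in the job log is consistent with the memory fault described above. Shells commonly report a process killed by a signal as 128 plus the signal number, and 139 - 128 = 11, which is SIGSEGV on most Unix systems. The helper below is a hypothetical sketch of that arithmetic and assumes the 128-plus-signal convention applies to the shell that ran the job.

```sh
# Hypothetical helper: print the signal number implied by an exit status,
# or print nothing when the status is not in the 128+signal range.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ] && [ "$status" -le 192 ]; then
        echo $((status - 128))
    fi
}

signal_from_status 139
```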

Aborted Module Name: AGENDYHB.AGENC001_01

Date: Day: Time: Resolution:

06/24/13 Mon 19:02 Restarted by Gudrun.

Error log and follow up comments:

Output directory /appworx/out

User search pattern:.*[^0-9']{ONE_UP}[^0-9].*

User search evaluated: .*[^0-9']3119612[^0-9].*

Search /appworx/out using .*[^0-9']3119612[^0-9].*

Files [/appworx/out/agenc001_3119612.log]

User rename pattern:AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.{ONE_UP}.{fileext}

User rename evaluated: AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.{ONE_UP}.{fileext}

Renaming file from:/appworx/out/agenc001_3119612.log

to:/appworx/out/AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.3119612.log

Cap: Error executing Banner commandError code returned:139

Command:/appworx/banner/banprod/UC4gjajobs.shl agenc001 C jobprd pass 3119612 NOPRINT