Aborted Module Name:   FAIDCFAT_SM.GLBDATA-LOOP_01

 

Date:        Day:      Time:          Resolution:

07/11/08     Fri          13:00           See Jan’s reply in follow-up section below.                                                                                                               

 

Error log and follow up comments:

 

 

+ cat /ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.DAT

+ 1> /ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.selections_todo

cat: 0652-050 Cannot open /ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.DAT.

+ exit 2

+ err=2

+ [ 2 -eq 0 ]

+ [ 2 != 0 ]

+ status=ABORTD

 

 

FAIDCFAT_FA.GLBDATA-LOOP_01 (and the SM equivalent) failed because the conditions on CHAIN_INIT were not executed. I ran the commands manually and restarted GLBDATA-LOOP. The chains are complete.

Jan.
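The cat failure in the trace above (0652-050) only shows that the driver file was missing. A minimal sketch of a guard that reports the likely cause before the copy is attempted; the function name and the message wording are illustrative assumptions, not part of the production GLBDATA-LOOP script:

```shell
# Hypothetical helper: copy a chain driver file to its work file,
# failing with a descriptive message when the driver file is absent.
copy_driver_file() {
    src=$1
    dst=$2
    if [ ! -r "$src" ]; then
        echo "driver file $src is missing or unreadable; check that the CHAIN_INIT conditions ran" >&2
        return 2
    fi
    cat "$src" > "$dst"
}

# Example call, using the paths from the trace above:
# copy_driver_file /ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.DAT \
#                  /ais01/dat/work/prod/FAIDCFAT_SM.GLBDATA-LOOP_01.selections_todo
```

The return code 2 mirrors the exit code in the trace, so the module would still end up in ABORTD status, only with a clearer message.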

 

 

 

Aborted Module Name:   FAIDSNTD.WAIT_FOR_CHAINS_01

 

  Date:        Day:      Time:          Resolution:

02/01/11     Tue         06:29         Restarted by ITS.

  

Error log and follow up comments:

 

*** ERROR:  NO CHAIN MODULES FOUND FOR CHAIN

+ cat /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.dat

      5719026.01 FAID      TDCLIENT_SEND       02/01 06:29 ABORTED     AWPROD    APPWORX

      5719049.01 FAID      TDCLIENT_SEND       02/01 06:37 ABORTED     AWPROD    APPWORX

      5719059.01 FAID      TDCLIENT_SEND       02/01 06:39 ABORTED     AWPROD    APPWORX

+ exit 1

 

It looks like it failed again because of the three TDCLIENT_SEND modules that are in failed status.  I tried restarting one of them, but it failed again:

+ tdclientc network=saigportal ftpuserid=TG51279 passive=Y data_over_command=y reset transfer=(name=CRDL11IN senduserid=TG51279      send=/ais01/dat/work/prod/CRDL11INsendfile      other_comp_parms=secfile=/ais01/dat/work/prod/CRDL11INsecfile)

+ 1> /ais01/dat/work/prod/TDCLIENT_SEND.CRDL11IN.TXT 2>& 1

+ exit 19

+ err=19.

Apparently there is a communication problem with SAIG. Maybe Phil can follow up with FAID staff about the situation with SAIG? Janice.

 

The TDCLIENT password has been changed (via FAIDSPWD_TDCLIENT_CHG_PASSWORD). I restarted the failed TDCLIENT_SEND modules and they completed successfully. Please restart the following failed components:

FAIDSNTD.WAIT_FOR_CHAINS_01

FAIDDLM2_EV.TDCLIENT_01.

Janice.

 

 

 

Aborted Module Name:   FAIDALEX_EV.SSH_SFTP_01

 

  Date:        Day:      Time:          Resolution:

07/14/08     Mon        18:00          David deleted the FAIDALEX chain.

 

Error log and follow up comments:

- sftp

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_elmnet"  SCH05FO@ftp.elmproduction.com

# > Authenticated with partial success.

# > Permission denied (password,gssapi-with-mic).

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

# > (100)

#==============================================================================

# FATAL : Command failed with code : 100

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

  Child: Job return = 100

14 18:00:24-  Child: put to memory:[100]

 

 

Janice left the message below in the News File:

“I'll leave the FAIDALEX_EV.SSH_SFTP_01 failure since this is a communications issue with vendor that will need to be resolved with them.”

 

07/16/08 – Per David – “I deleted the FAIDALEX chain from yesterday.”

 

 

 

Aborted Module Name:   FAIDPPNT_OD.LYNX_01

 

  Date:        Day:      Time:          Resolution:

06/12/13     Wed        11:05          Restarted by Joleen.

 

Error log and follow up comments:

 

 

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<html>

<body>

<img src="http://wsdev.colostate.edu/gld_fr_med_wh.gif"><BR><BR>

<font face="Arial">

 

You have requested a page that either never existed or no longer exists on this web server.<BR><BR>

 

The web page you are visiting is part of a larger site maintained by a department

at <a href="http://www.colostate.edu">Colorado State University</a>.<BR><BR>

If you came to this page via a "bookmark", this page may have been moved.  You may be able to rectify this

error by contacting the webmaster of this website by going to the main site page and finding any contact information on that page.  Otherwise, visit <a href="http://www.colostate.edu">CSU's main website</a>, and use directory search functionality to contact the department responsible for this website.

</font>

 

Hey Joleen, can you please change the lynx module’s url parameter?

From: http://wsprod.colostate.edu/cwis231/autorun/plus_email.cfm?ay={#1}

To: http://wsnet.colostate.edu/cwis231/autorun/JobChain/PlusEmail.aspx

 

It looks like I never asked to switch this over… sorry about that.

Zach.

 

It aborted again. Bummer! I have attached the standard output.

[Win32Exception]: The network path was not found

Here is the URL I used on the one I restarted:

http://wsnet.colostate.edu/cwis231/autorun/JobChain/PlusEmail.aspx

Joleen.

 

Try again, I made some modifications to the connections.

Zach.

 

 

 

Aborted Module Name:   FAIDTRAK_EV.LYNX_01

 

  Date:        Day:      Time:          Resolution:

08/08/08      Fri         06:29            Restarted; David confirmed that the reported error was correct (see the contents of Janice’s email below).

Error log and follow up comments:

 

 [100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

+ orig_log_run=Y

+ export orig_log_run

+ log_run=Y

+ export status log_run

+ [ Y = N ]

+ [ -f /appworx/exec/TROUBLE ]

+ [ -f /appworx/exec/TROUBLE.CSU_UTILITIES ]

+ [ -f /appworx/exec/TROUBLE.UNKNOWN ]

+ [ -f /appworx/exec/TROUBLE.APPWORX_SHELLS ]

+ [ -f /appworx/exec/TROUBLE.FAIDTRAK_EV.LYNX_01 ]

+ [ -f /appworx/exec/TROUBLE.LYNX.KSH ]

+ export status

+ [ -f /appworx/exec/COMPLETION ]

+ echo Executing COMPLETION

Executing COMPLETION

 

I noticed that the FAID schedule stalled out early last night due to this FAIDTRAK failure.  I know that I haven’t had a chance to discuss the exclude date changes with you, but IT Scheduling may need to use that feature today if yesterday’s FAID schedule runs too long.

So, briefly, here’s the deal.

The portion of your documented procedure to:

- Update /ais01/dat/work/prod/AAAAAW99.WAIT_FOR_CHAINS_01.DAT with the current date (in mm/dd format), where AAAA is the 4-character application name

should be modified to:

- Update Applications Manager subvar #AAAAAW99_EXCLUDE_DATE with the current date (in mm/dd format), where AAAA is the 4-character application name.  If the update does not occur until after midnight, use the before-midnight date, i.e. the date that corresponds to the “scheduled” run date for the new day’s schedule.

I think this would be the only change to your documented procedure.  The WAIT_FOR_CHAINS script will automatically set the Applications Manager subvar #AAAAAW99_EXCLUDE_DATE back to “NO_EXCLUDE_DATE” when it completes, so IT Scheduling does NOT have to worry about resetting the value when AAAAAW99 completes.

So,  if yesterday’s FAID schedule runs too long and IT Scheduling must perform the after hours monitoring, then they would update the FAIDAW99_EXCLUDE_DATE with 08/08 (after yesterday’s FAID jobs have all completed) as documented in your procedures.

Janice.
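Because the exclude date is entered by hand, a quick format check before updating #AAAAAW99_EXCLUDE_DATE can catch slips such as 8/8 or 08-08. A minimal sketch; the function name is hypothetical, month lengths are not checked (02/31 passes), and the update of the subvar itself is left to the documented Applications Manager procedure:

```shell
# Hypothetical check: succeed only for a zero-padded mm/dd value
# (month 01-12, day 01-31). Month lengths are not validated.
is_mmdd() {
    case $1 in
        0[1-9]/0[1-9] | 0[1-9]/[12][0-9] | 0[1-9]/3[01]) return 0 ;;
        1[0-2]/0[1-9] | 1[0-2]/[12][0-9] | 1[0-2]/3[01]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# is_mmdd "08/08" && echo "08/08 is a usable exclude date"
```

The check uses only shell patterns, so zero-padded values such as 08 never go through arithmetic evaluation.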

 

 

 

 

Aborted Module Name:   SCIQSQ2F.SCIQS009_01

 

  Date:        Day:      Time:          Resolution:

08/13/08     Wed        17:20          Restarted by Jan, see Jan’s comments below.

Error log and follow up comments:

 

 

 

+ + expr 6 + 1

loopcnt=7

+ [ 7 -lt 7 ]

+ [ n = y ]

+ /appworx/exec/FILESIZE SCIQSQ2F.SCIQS009_01.1645544.1645545.00.2008_08_13_1700.jobout 100

no output from SCIQSQ2F.SCIQS009_01

+ err=100

+ date

+ echo exiting  SQLP_CSU Wed Aug 13 17:00:45 MDT 2008

exiting  SQLP_CSU Wed Aug 13 17:00:45 MDT 2008

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

 

 

SCIQSQ2F.SCIQS009_01 is complete; the error is shown above.  Kelly created a synonym to resolve it.

Jan.

 

 

 

Aborted Module Name:   ADMSSATL.SRTLOAD-SRRSRIN_01

 

  Date:        Day:      Time:          Resolution:

08/29/08     Fri          14:38           Restarted by David after Marcella corrected data.

12/11/08     Thu        15:40           Restarted by David, see error he found below.

Error log and follow up comments:

 

 

08/29/08.

chain_status=SRTLOAD_COMPLETE

+ [[ SRTLOAD_COMPLETE != SRRSRIN_COMPLETE ]]

+ print SRTLOAD_SRRSRIN UNSUCCESSFUL - ABORT MODULE

SRTLOAD_SRRSRIN UNSUCCESSFUL - ABORT MODULE

+ exit 1

+ err=1

 

A bad record in SRIPREL was causing this to fail. Marcella purged it and the job finished successfully.

David.

 

12/11/08.

SRTLOAD_SRRSRIN UNSUCCESSFUL - ABORT MODULE

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

I restarted ADMSACTL.SRTLOAD-SRRSRIN_01.

I found the error:

ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","UPDATE GOTCMRT SET GOTCMRT_M...","sql area","tmp")

David.

 

 

 

Aborted Module Name:   ADMSWEBC.SRTLOAD_01

 

  Date:        Day:      Time:          Resolution:

09/28/09      Mon       20:00          See note from Janice, Bev & Jan below.

Error log and follow up comments:

 

print *** \n*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog_exceptions

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog /appworx/out/ADMSWEBC.SRTLOAD_01.3369958.3369961.00.2009_09_28_2000.AWPROD.LOG

+ 1>> /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

+ rm -ef /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

rm: removing /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

+ grep FTP_

+ print ADMSWEBC.SRTLOAD_01

+ rm -ef /ais01/dat/work/prod/ADMSWEBC.SRTLOAD_01_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

+ awexe get_var_value subvar=#critchain_3369958

N

+ + awexe get_var_value subvar=#critchain_3369958

this_critchain=N

+ [[ N = Y ]]

+ print NON-CRITICAL CHAIN COMPONENT FAILURE

NON-CRITICAL CHAIN COMPONENT FAILURE

+ show_status=ABORTED

+ status=ABORTD

 

+ rm -f /appworx/run/temppar.4703455

+ 1> /dev/null 2>& 1

+ [ 139 != 0 ]

+ echo Non-zero error generated from running job. The program srtload failed to run successfully.

Non-zero error generated from running job. The program srtload failed to run successfully.

From the banner log file (ADMSWEBC.SRTLOAD_01.3369958.3369961.00.1680658.log):

Address information is missing for record with name of

 Sierra Helterbrand and SSN of

From the banner lis file (ADMSWEBC.SRTLOAD_01.3369958.3369961.00.1680658.lis):

Number of Records Read from Tape: 19

Total of Prospects Loaded : 19

Total of PIDMs Matched : 0

Total of Conversion Errors : 22

All of the output files listed above should be viewable via Applications Manager Output File Viewer.       

 Janice.

 

Vicki and I searched the Banner UDC and found an entry that may have a solution.  Mark Britton is checking it out.

Admissions is going to try to run the job without using Applications Manager, to eliminate that as a contributing factor.

Bev.

 

I’ve deleted ADMSWEBC.SRTLOAD_01 to allow the schedule to complete per Vicki.

Jan.

 

 

 

 

 

 

 

 

 

Aborted Module Name:   FAIDALEX_EV.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

08/26/08    Tue         18:00           See Jan’s note below.

Error log and follow up comments:

 

 

The following Applications Manager module is in "EMPTY FILE" status:

 

FAIDALIM.SSH_SFTP_DL_01

 

27 08:16:07-Parent: sleeping for 10 seconds.

27 08:16:17-Parent: (2)Checking child process(786592)

27 08:16:17-Parent: Child process[786592] found

27 08:16:17-Parent: Checking child mem

27 08:16:17-Parent: Value in mem [N]

27 08:16:17-Looking for [/appworx/run/kill.1688722.00]

27 08:16:17-No Kill File found('/appworx/run/kill.1688722.00').

27 08:16:17-Parent: sleeping for 10 seconds.

# > Authenticated with partial success.

# > Permission denied (password,gssapi-with-mic).

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

# > (100)

#==============================================================================

# FATAL : Command failed with code : 100

 

I deleted all modules that were running or still waiting to run, except CHAIN_FINISH.  I then requested the FAIDALIM chain to run again, deleting the modules that had already run successfully (after verifying that the driver file had been built correctly).  FAIDALIM is complete.

Jan.

 

 

 

Aborted Module Name:   ODSRTEST.ODSRS002_07

 

  Date:        Day:      Time:          Resolution:

09/09/08     Tue        07:34            Re-started by David.

 

Error log and follow up comments:

 

ERROR at line 8:

ORA-06550: line 8, column 5:

PLS-00201: identifier 'CSUG_ODS_REFRESH.LOG_BEGIN_TIME' must be declared

ORA-06550: line 8, column 5:

PL/SQL: Statement ignored

ORA-06550: line 66, column 5:

PLS-00201: identifier 'CSUG_RUN_OWB_TASK' must be declared

ORA-06550: line 66, column 5:

PL/SQL: Statement ignored

 

ODSRPROD.ODSRS002_07 was set to the wrong login. I corrected this and re-started it.

David.

 

I notice that this is a new component for refreshing HR on ODS.  It’s dependent on the EIDS ODS refresh, but I was wondering if it should be dependent on any HRMS updates completing?

Janice.

 

 

 

Aborted Module Name:   DOITDEMO_01.FTPS_CURL_01

 

  Date:        Day:      Time:          Resolution:

09/19/08      Fri         17:00          See Janice’s note below, no output log to display.

06/19/11      Sun       21:47           Restarted by Steve G.

 

Error log and follow up comments:

 

09/19/08.

HRMSAW99 is waiting for DOITDEMO, which failed in FTPS_CURL. I think this is a test run, so I am leaving it as is, but I put DOITDEMO into the HRMSAW99 exceptions file so HRMSAW99 can complete.  As a follow-up, DOITDEMO should be moved from the HRMS application/queue to the DOIT application/queue, which will keep it from holding up HRMSAW99 in the future. Janice.

 

06/19/11.

#   REMOTE USER        : [$ZB01]

# > curl: Can't open '/ais01/ftp/to/user/DOITDEMO_01.HRMSS051_01.DAT'!

# > curl: try 'curl --help' or 'curl --manual' for more information

# > (26)

#==============================================================================

# FATAL : Command failed with code : 26

#------------------------------------------------------------------------------

# 2011.06.30-21:47:53  : RETURN CODE = 100 : /appworx/csu/exec/FTPS_CURL.PL

#==============================================================================

# > (100)

 

Elden and I have been looking at this abort, and we think we may have found the problem.  Going by the error message above, it looks like the DOITDEMO_01.HRMSS051_01.DAT file (utl_file1) was never sent to the /ais01/ftp/to/user directory by the HRMSDEMO_01.HRMSS051_01 component of HRMSDEMO_EXTRACT.  However, the component DID save the same utl_file1 to the /ais01/bkp directory as HRMSDEMO_01.HRMSS051_01.utl_file1.2011_06_30_2147.bak in a later step.  This appears to be a timing issue.

If the HRMS folks can look at /ais01/bkp/HRMSDEMO_01.HRMSS051_01.utl_file1.2011_06_30_2147.bak and confirm that the data looks good, we can copy that file to /ais01/ftp/to/user/DOITDEMO_01.HRMSS051_01.DAT and try restarting the failed DOITDEMO_01.FTPS_CURL_01 component.  Please advise.  Thanks!

Steve G.

 

 

 

Aborted Module Name:   HRMSFRS_SAL.FTP_AIS01_AIS00_01

 

  Date:        Day:      Time:          Resolution:

10/23/08     Thu       11:30            See Janice’s note below.

 

 

Error log and follow up comments:

 

 

Net::FTP=GLOB(0x30334bac)<<< 451-Transfer aborted.  Error during I/O processing.  System code is B37-04

# put    + : Transfer aborted.  Error during I/O processing.  System code is B37-04

 

 

 

The mainframe file ran out of space (B37). I re-allocated PMDT.APPWORX.HRMSFRS.SAL.PFRS05J2.FRS in cylinders instead of tracks and resubmitted this component, which then finished successfully.

Janice

 

 

 

Aborted Module Name:   AREGORCC.CHAIN_INIT_01

 

  Date:        Day:      Time:          Resolution:

10/30/08     Thu       15:30          See David’s, Janice’s & Dawn’s notes below.

 

 

Error log and follow up comments:

 

this_chain_start=30-Oct-2008 15:22:10

  + cat /dev/null

  + Cannot write to a directory.

  /appworx/csu/exec/CHAIN_INIT.KSH[42]: /ais01/ftp/to/eprint/: 0403-005 Cannot create the specified file.

  + exit 1

  + err=1

  + [ 1 -eq 0 ]

  + [ 1 != 0 ]

  + status=ABORTD

 

 

I talked to Dawn and she will re-submit this. The AREGORCC chain was brought in without a schedule prefix.

David.

 

This chain was requested without a chain prefix value.  Whenever chain components appear in backlog with names like “.CHAIN_INIT_01” (i.e. no chain prefix preceding the “.”), the problem is with the way the chain was requested to run.

By the way, if the “request” procedure is to be used to request this chain, we should probably change prompt #1 to “value required”, which will prevent the chain from being requested without a value for prompt #1.

Or we could set up a list of values (LOV) with the valid chain prefix values, and IT Scheduling could then use the request procedure and simply choose the correct chain prefix from the LOV (similar to what we are doing with HRMSAW90).

Janice.
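Until prompt #1 is made “value required” or backed by an LOV, the initialization script itself could refuse an empty prefix. In the trace above, the missing prefix apparently left only the directory /ais01/ftp/to/eprint/ as the output target, which is why the write failed. A minimal sketch of such a guard; the function name and the allowed-character rule are assumptions, not taken from CHAIN_INIT.KSH:

```shell
# Hypothetical guard for the top of a chain-initialization script:
# refuse to continue when the chain prefix (prompt #1) is empty.
require_chain_prefix() {
    prefix=$1
    if [ -z "$prefix" ]; then
        echo "chain prefix is empty; the chain was requested without a value for prompt #1" >&2
        return 1
    fi
    # Assumed naming rule: letters, digits and underscores only.
    case $prefix in
        *[!A-Za-z0-9_]*)
            echo "chain prefix '$prefix' contains unexpected characters" >&2
            return 1
            ;;
    esac
    return 0
}

# Example:
# require_chain_prefix "$chain_prefix" || exit 1
```

Failing at this point would give IT Scheduling a message that names the real cause instead of the 0403-005 file error.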

 

When Denise requested this chain to run today, the chain notes said to use the Request procedure.  I spoke to Jan and she had me update the note so it said Schedule procedure.

Dawn.

 

 

Aborted Module Name:   AROSFRQ1.GLBEXTR_POPSEL_01

 

  Date:        Day:      Time:          Resolution:

01/13/10    Wed        01:00           DB error; restarted by David.

 

Error log and follow up comments:

 

I got paged at 1:00am about a DB ERROR on

AROSFRQ1.GLBEXTR_POPSEL_01.  There was no output file, but

Conditions showed Timing of "BEFORE" and  Performed

of "DONE". Called David and left a message with the

information above.  He called back and said he was logging

in to check it out.

Steve.

 

AROSFRQ1.GLBEXTR_POPSEL_01 failed with a DB ERROR. I reset

DONE conditions and re-started. There were no other jobs

with errors.

David.

 

 

 

Aborted Module Name:   ODBAMNTR.ODBAS001_01

 

  Date:        Day:      Time:          Resolution:

09/08/09    Tue        07:00          See note from Janice below.

 

Error log and follow up comments:

 

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ORA-12541: TNS:no listener

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

Received EM alerts at approximately 03:00 this morning that production systems were down. It appears the recycle job got hung up. I killed the recycle jobs and manually ran the oracle_system_startup script and oracle_famis_system_startup script.  Everything appears to be up and running now.  Mark. B.

 

Due to the problem with the Oracle recycle process Sunday night, all of the ODS Refresh Applications Manager Chains had components which failed with: ORA-12541: TNS:no listener

   

These ODS refresh chains are dependent on the Applications Manager recycle chain (ODBACYCP_RECYCLE_PROD_SYSTEMS), which contains the ODBA_RECYCL_PRD chain component that executes the /app/oracle/admin/dba/mgr/oracle_system_recycle script.  Even though the recycle did not complete successfully last night, /app/oracle/admin/dba/mgr/oracle_system_recycle apparently returned a zero return code, thereby allowing dependent ODS Refresh jobs to proceed and subsequently fail due to the TNS no listener problem.

Would it be possible for the /app/oracle/admin/dba/mgr/oracle_system_recycle script to return a non-zero return code when such problems occur?  If the error could be trapped, then dependent ODS Refresh Chains would not run until the problem was resolved and appropriate DBA, plus Applications Manager followup, had been done.  When manual activity is taken to resolve the oracle recycle problem, the failed ODBA_RECYCL_PRD chain component of the ODBACYCP_RECYCLE_PROD_SYSTEMS chain would also need to manually be deleted, which would then allow dependent ODS Refresh chains to proceed. 

The dependency connection between the "Oracle System Recycle" chain and ODS Refresh Chains will only be meaningful and effective if failure(s) in the "Oracle System Recycle" script are detected and reported back to Applications Manager. Janice.

 

There was no failure; it got stuck, so there really was no way to report this back programmatically.  It looked like there was some code in one of the scripts to detect that the process is hung, but as far as I could tell it didn't work. Mark. B.

 

For the second time in two weeks, Sunday night ODS Refresh/Applications Manager Chains had components which failed with "Enter user-name: ERROR: ORA-12541: TNS:no listener" due to problems with the Oracle recycle process.  Although the 8/23 situation was apparently not programmatically eligible for reporting back to Applications Manager, I'm wondering if last night's situation (9/7/09) was something that should/could have been reported back to Applications Manager and/or resulted in a DBA page?  Janice.

 

Nope it wasn't.   Mark. B.
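One way a hung recycle could still be turned into a non-zero return code is to put an upper bound on its run time. A minimal sketch, assuming a time limit can be chosen that a healthy recycle never exceeds; the wrapper name and the one-hour limit in the example are hypothetical, and this is not code from the oracle_system_recycle script:

```shell
# Hypothetical wrapper: run a command, but give up after "limit" seconds
# and return non-zero so that dependent jobs do not start.
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog: terminate the command if it is still running at the limit.
    # Its output is discarded; its sleep may linger until it expires.
    ( sleep "$limit"; kill "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!
    rc=0
    wait "$cmd_pid" || rc=$?
    # The command ended (or was killed); the watchdog is no longer needed.
    kill "$watchdog_pid" 2>/dev/null
    return "$rc"
}

# Example (limit of 3600 seconds is an assumption):
# run_with_timeout 3600 /app/oracle/admin/dba/mgr/oracle_system_recycle || exit 1
```

Terminating a half-finished recycle may itself need DBA follow-up, so in practice such a wrapper might only report the hang and leave the process alone.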

 

 

 

Aborted Module Name:   HRMSSQWL.SQWLARCH-LOOP_01

 

  Date:        Day:      Time:          Resolution:

01/13/09    Tue         07:30         See Debbie’s & Janice’s notes below.

 

 

Error log and follow up comments:

     

+ egrep ABORTED|CRITFAIL|C-Error

+ awexe jh

+ grep 2220862

      2220862.00 BATCH     HRMSSQWL.SQWLARCH_CO01/12 17:23 02:11:48 C-Error     DGUZMAN    HRMSSQWL_STATE_QTRLY_WAGE_LIST

+ print Failure in spawned CO - abort SQWLARCH-LOOP

Failure in spawned CO - abort SQWLARCH-LOOP

+ exit 1

+ err=1

 

SQWLARCH FAILED – Colorado

Please refer to email that was generated from the ABORT on Monday at 5:23pm and follow IT Scheduling instructions. 

Debbie

 

As a reminder, the SQWLARCH-LOOP is structured to automatically email the user (and IT Scheduling) regarding SQWLARCH failures.  The email for the failed SQWLARCH for Colorado was sent when it failed at 05:23 P.M. yesterday.  Elaine should be doing the normal manual follow-up and then will contact IT Scheduling to request a restart.  Please refer to the various HRMSSQWL related emails sent late last week for more details.

After reviewing those emails, if you have any questions about the process then let me know.

Janice

 

The sqwl stuff directly sends a follow-up email to the users (HRSAO SQWLArch Followup), as well as to the Alert HRMS WHRS and Alert APMX lists.  Please refer to the earlier mail, with subject PYUSSQWLGRE(ID=6792213) SPAWNED SQWLARCH PROBLEM - Colorado, dated Tue 7/12/2011 9:58 AM.  Because we have this automated feedback reporting for the SQWL failures, there is no need to also send the normal HRMS ABORT followup email.

Janice.

 

 

Aborted Module Name:   HRMSW2P2.WAIT_FOR_W2PDF_01

 

  Date:        Day:      Time:          Resolution:

01/13/09     Tue       20:46           See notes below.

01/12/10     Tue       15:50           See notes below, similar to previous year’s ABORT.

 

Error log and follow up comments:

 

01/13/2009.

There is no output file to look at – what would be our next step?  Thanks...

Look at the module conditions on that module.

Jan.

 

Today Jan was showing us that this module aborts after it cannot find the PDF file after 5 hours.  Janice said this was due to the HRMS CONCURRENT MANAGER JOB FAILURE.  Has the problem been solved?  Could this be the reason this chain has failed again?

 

The W2s have been running for over 5 hours which is causing this abort message.

The problem is that the W2s are taking a long time to generate.

Alan.

 

While this morning the problem was a HRMS CONCURRENT MANAGER JOB FAILURE, the latest abort was simply due to the fact that we exhausted the time interval for checking for the spawned concurrent process(es) to complete.  The spawned concurrent process to generate the W2 PDF’s is still running.  I’ve restarted the failed component and increased the time interval.

Janice

P.S.   I’ve talked with Ken about checking on this chain tonight – so I’ll plan to monitor it over the course of the evening.

 

01/12/2010.

There is no output file.

 

HRMSW2P2.WAIT_FOR_W2PDF_01 timed out at 5 hours waiting for the W2 file to be processed. I gave it more time and re-started it.

David.
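The timeout behaviour described in this entry (poll for the W2 PDF, abort once the allowed interval is used up) can be sketched as a bounded wait. This is a minimal sketch, not the WAIT_FOR_W2PDF script itself; the function name, its arguments, and the example path are assumptions:

```shell
# Hypothetical bounded wait: poll for a file every "interval" seconds and
# give up with a non-zero return code after "max_wait" seconds.
wait_for_file() {
    path=$1
    max_wait=$2
    interval=$3
    waited=0
    while [ ! -f "$path" ]; do
        if [ "$waited" -ge "$max_wait" ]; then
            echo "timed out after ${waited}s waiting for $path" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example: wait up to 5 hours (18000 seconds), checking once a minute.
# wait_for_file /path/to/w2.pdf 18000 60 || exit 1
```

"Giving it more time", as in both restarts above, corresponds to raising max_wait.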

 

 

 

 

Aborted Module Name:   HRMSACH_SAL.PAYUSXFR_01

 

  Date:        Day:      Time:          Resolution:

01/21/09     Wed       08:30           See David’s note below.

01/23/09     Fri         09:30            See David’s note below.

 

Error log and follow up comments:

 

 

 

01/21/09.    

PAYUSXFR had some date variables that wrapped to the next field. We manually fixed the variables and the job is complete.

David.

 

01/23/09

HRMSACH_SAL.PAYUSXFR_01 had problems with variables wrapping. I manually fixed this and it is complete.

David.

 

 

 

 

 

 

 

Aborted Module Name:   VSTAJOBS.VPLUS_MIGRATION_01

 

  Date:        Day:      Time:          Resolution:

03/29/10     Mon        07:05          See follow up below.       

 

Error log and follow up comments:

 

mv: 0653-401 Cannot rename vmfiles/CompressGens.lis.old to vmfiles/CompressGens.lis.old.old:

                     A file or directory in the path name does not exist.

+ exit 7

  Child: Job return = 7

29 07:05:40-  Child: put to memory:[7]

 

29 07:05:40-  Child: In memory:[7]

 

  Child:Done.

29 07:05:40-Child:Done

Jan.

 

This job did produce error messages for /vptmp/tmp not found and the ‘mv: 0653-401’ messages when it was rerun a bit later.

I resolved the problem early today and ran the migration process, which worked.

For some reason, every time we restarted the job it kept reporting that it had aborted in Applications Manager.

I could not find any reason it kept failing when it worked outside of Applications Manager. My only suspicion is that something was searching the log for Abort messages and kept concluding that the job had failed when it had not.

The resolution, after I fixed the issue, was to have Jan delete the job and rerun it fresh. This time it worked fine.

I am looking into a possible issue with some reports captured on Friday night that are in the database but not in the Vista reports filesystem.

The reports would show in Vista Plus but not be accessible, because only the database entry exists and the report file is not there.

We are looking into some reports based on generation sequence number so we might be able to tell which reports were affected.

Then we can clean up the empty reports and rerun them in Applications Manager if we can.

 

Rich.
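A log search of the kind Rich suspects is visible in the ADMSWEBC.SRTLOAD_01 trace earlier in this log: error strings are pulled from the job log with egrep -i -f, and known-harmless lines are filtered out with egrep -v -i -f. A minimal sketch of that two-stage scan, assuming pattern and exception files with one extended regular expression per line; the function name is illustrative, and grep -E is used here as the equivalent of egrep:

```shell
# Hypothetical two-stage scan: print log lines that match any error
# pattern and are not covered by an exception pattern.
# Returns 0 when at least one line survives, non-zero otherwise.
scan_log() {
    log=$1
    patterns=$2
    exceptions=$3
    grep -E -i -f "$patterns" "$log" | grep -E -v -i -f "$exceptions"
}

# Example:
# if scan_log job.log error_patterns exception_patterns; then
#     echo "error strings found; treating the job as aborted"
# fi
```

With a scan like this, a leftover message in the log keeps matching after the underlying problem is fixed unless an exception pattern covers it, which would be consistent with the job reporting an abort until it was deleted and rerun fresh.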

 

 

 

Aborted Module Name:   ODSRTEST.ODSRS002_07

 

  Date:        Day:      Time:          Resolution:

02/02/09     Mon        07:39           See notes below.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUH_CURRENT_PERSON

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 75

 

Mapping has been refreshed manually.  HR Test db must have been off-line last night.

Please do not restart the chain.   The refresh of HR on ODS Test will run tonight.

Mark.

 

We deleted this chain.  As it turns out, we should have only deleted the module per Jan.

 

When communicating regarding course of action for failed Applications Manager chain components, it  is important that there is a clear understanding of the Applications Manager terminology (chain component vs. chain) and the difference between deleting an Applications Manager chain component vs. deleting an Applications Manager chain from backlog.  While the desire may have been to not restart the  failed ODSRTEST.ODSRS002_07,  deletion of the ODSRTEST chain not only deleted the failed ODSRTEST.ODSRS002_07 component but also deleted the following chain components:

2009-02-02 07:26:04.0   2009-02-02 07:26:04.0   00:00:00   ODSRTEST.ODSRS004_02       DELETED   ODSRTEST_REFRESH_ODSTEST   ODSR   AWPROD   2318769

2009-02-02 07:26:04.0   2009-02-02 07:26:04.0   00:00:00   ODSRTEST.SEND_MAIL_01      DELETED   ODSRTEST_REFRESH_ODSTEST   ODSR   AWPROD   2318755

2009-02-02 07:26:04.0   2009-02-02 07:26:04.0   00:00:00   ODSRTEST.CHAIN_FINISH_01   DELETED   ODSRTEST_REFRESH_ODSTEST   ODSR   AWPROD   2318753

The ODSRTEST.ODSRS004_02 chain component which was deleted would have updated the ODSTEST csug_ods_refresh_status table  with an “end” time for the  OVERALL_NIGHTLY_REFRESH table entry  – to indicate the time that all ODSRTEST refresh components had completed.  Currently, the  ODSTEST csug_ods_refresh_status table  entry for OVERALL_NIGHTLY_REFRESH has a BEGIN_TIME value of 01-FEB-09 11.00.49 PM, but a null END_TIME value due to deleting the ODSRTEST.ODSRS004_02 chain component.

The ODSRTEST.SEND_MAIL_01 chain component which was deleted would have sent the “ODSTEST Refresh Statistics” summary email to the ODSR email list.

Finally, the ODSRTEST.CHAIN_FINISH_01 chain component which was deleted would have performed chain cleanup, including deletion of chain specific Applications Manager subvars and deletion of /ais01/dat/work/prod/ODSRTEST* work files.  Also, the CHAIN_FINISH component has a BEFORE condition to set the subvar value: #ODSR_RUN_ODSRTEST={#ODSR_RUN_ODSRTEST_SETVAL}

By deleting the CHAIN_FINISH component, this BEFORE condition was not performed. In this particular situation that did not cause problems, because the #ODSR_RUN_ODSRTEST_SETVAL subvar had the same value as the current value of #ODSR_RUN_ODSRTEST.  However, if #ODSR_RUN_ODSRTEST_SETVAL had been different from #ODSR_RUN_ODSRTEST, deletion of this chain component would have left #ODSR_RUN_ODSRTEST with an incorrect value.

Janice.

 

 

 

Aborted Module Name:   AROSFRQ1.TGRAPPL_01

 

  Date:        Day:      Time:          Resolution:

04/27/11     Wed       07:15         See follow up from Janice below.

 

Error log and follow up comments:

 

Username:

Password: Connected.

tgrappl completed successfully

0 lines written to /appworx/out/AROSFRQ1.TGRAPPL_01.6152805.6158460.00.2273961.lis

Starting TGRAPPL (Release 8.1.1.1)

*********************************************************

*                   **WARNING**                         *

*  You cannot submit this job - it is already running.  *

*                                                       *

*  You will also get this message if a previous run of  *

*  this program aborted.  If this is the case, the      *

*  control record for that run must be deleted before   *

*  proceeding. (GJBPRUN record for this jobname with    *

*  a -1 one-up-no).                                     *

*                                                       *

 

There was a timing problem between the AROSFRQ1 spawned AROS_PYMTS chain, in which TGRAPPL is executed, and the AROSDPA3 chain that also executes TGRAPPL.  AROSDPA3 is dependent on AROSAM27_STOP_AROSFRQ1_PYMTS, which creates the /ais01/dat/apwx/prod/AROS-PYMTS-LOOP_daily_stop file.  The presence of this file prevents the AROSFRQ1 AROS-PYMTS-LOOP script from spawning any new AROS_PYMTS chains.  However, in the situation which occurred this morning, it appears that the timing was such that the AROS_PYMTS chain had already been spawned, but not enough time had elapsed between AROSAM27_STOP_AROSFRQ1_PYMTS and AROSDPA3 to allow for the AROS_PYMTS chain to complete. 

I've added a 5 minute delay to AROSAM27 after creation of the AROS-PYMTS-LOOP_daily_stop file, which hopefully will prevent this situation in the future. 

 

I actually thought some database cleanup was required when two TGRAPPL executions stepped on each other's toes... but I did try to restart AROSFRQ1.TGRAPPL, thinking that it wouldn't work anyway.  However, much to my surprise, it completed successfully - sorry, I would have passed the restart on to Dawn had I really thought it would work :)

Janice.
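
For reference, the stop-file handshake and the added delay described above can be sketched roughly as follows.  This is only an illustration and is not the actual AROSAM27 or AROS-PYMTS-LOOP logic: the helper names may_spawn and stop_and_wait are made up for this note, and the real scripts may work differently.

```sh
# Illustrative only; these helper names are hypothetical and are not
# taken from the production AROS-PYMTS-LOOP or AROSAM27 scripts.

# Loop side: succeed (return 0) only when the stop file is absent,
# meaning a new AROS_PYMTS chain may be spawned.
may_spawn() {
    [ ! -e "$1" ]
}

# Stop side: create the stop file, then pause for a grace period (seconds)
# so a chain spawned just before the file appeared has time to finish.
stop_and_wait() {
    : > "$1" || return 1
    sleep "$2"
}
```

With these helpers, a call such as stop_and_wait /ais01/dat/apwx/prod/AROS-PYMTS-LOOP_daily_stop 300 would correspond to the 5 minute delay.  Note that a fixed delay only narrows the timing window; it does not guarantee that a chain spawned earlier has finished.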

 

 

Aborted Module Name:   HRMSENCD.HRMSS103_01

 

  Date:        Day:      Time:          Resolution:

04/28/09     Tue       07:38           See notes below.

 

 

Error log and follow up comments:

 

HRMSENCD.HRMSS103_01 has been running for over 11 hours, delaying the completion of the HRMS encumbrance processing (Applications Manager chain HRMSENCD_DAILY_ENCUMBRANCES).  Consequently, the remainder of the HRMS/WFRS/WHRS Applications Manager schedules are also waiting for the HRMSENCD_DAILY_ENCUMBRANCES chain to complete.

Please advise regarding HRMSS103 – and the course of action which should be taken.

Janice.

 

Craig killed the HRMSENCD.HRMSS103_01 associated Oracle process and we’ve restarted it.  However, historically HRMSS103 only runs 2-3 minutes and it has already been running for more than 10 minutes – how long should we allow HRMSS103 to run?

Janice.

 

When I tried to locate the error, Applications Manager froze.

 

 

I tried to get back into Applications Manager and the same thing happened.

 

Module HRMSENCD.HRMSS103_01 is in CRITFAIL status.  So, Joleen and I both tried to look at the output file so we could send out an e-mail.  Jan called to say the output file was way too big (80599209 Bytes) and trying to open it is what took Applications Manager down.  She does not want anyone to try to open this output file.  I asked her if there was a way to know how big is too big.  She said she was talking to her co-workers about it and they don’t know the answer.  So, please do not open the output file for the above mentioned module.

Dawn.
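
Since nobody could say how big is too big, one possible precaution is to check the size of an output file from the command line before opening it in Applications Manager, and to look only at the last few lines if it is large.  The sketch below is a suggestion only: the helper names are made up for this note, and any size limit passed in is an assumption, not a known Applications Manager threshold.

```sh
# Illustrative only; the helper names and any size limit are assumptions.

# Succeed (return 0) when the file exists and is no larger than the given
# number of bytes; fail otherwise.
small_enough_to_open() {
    [ -f "$1" ] || return 1
    # wc -c may pad its output with spaces; $(( )) normalizes it to a number.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le "$2" ]
}

# Print the size and only the last lines (default 50) of a file,
# instead of opening the whole file.
peek_output() {
    printf '%s bytes\n' "$(( $(wc -c < "$1") ))"
    tail -n "${2:-50}" "$1"
}
```

For the file in this incident, peek_output would have reported 80599209 bytes and shown only the tail of the log, which is usually where the error message is.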

 

 

 

 

 

 

Aborted Module Name:   AROSDTRN.TSRCBIL_01

 

  Date:        Day:      Time:          Resolution:

09/29/09     Tue         08:30          See note from Janice below.

 

Error log and follow up comments:

 

 

 

The following module is in LAUNCH ERROR status:

AROSDTRN.TSRCBIL_01

I already tried to restart it.

Jan called to say they are working on this.

Dawn.

 

While the banprod_jobprd AWPROD link to banprod is able to execute other sql to populate Applications Manager variables, the problem seems to be isolated to the inability to execute a function via the link – note that the jobprd userid directly logged onto banprod can execute the function(s).

As a short term solution, I’ve made backup copies of the various AROS subvars which execute function(s) in the underlying SQL logic.

Then I manually executed the functions (as jobprd on banprod) to determine the value which would have been returned and hard-coded the resultant values into the corresponding AROS Applications Manager subvars as shown below.  This short term solution of hard-coded values will be adequate until the resultant values would be different from those which I hard-coded – but hopefully it will give us a short time period in which to solve the problem but still allow the various AROS chains to run in the meantime.

#AROS_CUR_TERM          Type=Numbers     {200990}

#AROS_CUR_TERM_bkp      Type=Numbers     {SQL}

#AROS_NEXT_TERM         Type=Numbers     {201010}

#AROS_NEXT_TERM_bkp     Type=Numbers     {SQL}

#AROS_PREV_TERM         Type=Character   {200960}

#AROS_PREV_TERM_bkp     Type=Character   {SQL}

Janice.

 

 

 

Aborted Module Name:   ADMSAPPL.LYNX_01

 

  Date:        Day:      Time:          Resolution:

05/21/09     Thu       22:20          Restarted by Janice.

 

 

Error log and follow up comments:

 

URL=http://wsprod.colostate.edu/cwis116/application/BanTranPay.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

 

I talked to Bev about this aborted component and learned that the normal course of action is to just restart it – so it’s running again.

Janice.

 

 

 

Aborted Module Name:   FAIDVRWF_EV.FAIDS025_01

 

  Date:        Day:      Time:          Resolution:

04/14/11     Thu       15:30           See follow up below.

 

Error log and follow up comments:

 

Enter user-name: ERROR:

ORA-12519: TNS:no appropriate service handler found

 

Enter user-name: SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon>  ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

Enter user-name: SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon>  ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

+ [ -f login.2040276 ]

+ echo Could not login to sqlplus

Could not login to sqlplus

+ err=1

 

Since this is a cyclic chain, we already have 3 more in self wait status.  Please delete 2 of the ones in self wait status, then delete this failed component.  I think the Banner problem has been resolved.

It would be okay to go ahead and delete the FAIDVRWF self wait chains which will keep coming into backlog every 15 minutes - then when the problem is resolved, we'll have been keeping current on cleaning those up.  We won't need/want to run all the backlogged ones. 

Janice.

 

 

Aborted Module Name:   OSYSJOBS_06.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

12/10/09     Thu        16:31          See note from Janice below

 

 

Error log and follow up comments:

 

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_06.OSYSPURG_01.3680460.3680464.00.2009_12_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_06.OSYSPURG_01.3680460.3680464.00.2009_12_10_1630.log

+ cut -f 2 -d =

+ grep SCRIPT ABORTED

rsh_return_code=1

+ rm -ef /ais02/log/OSYSJOBS_06.OSYSPURG_01.3680460.3680464.00.2009_12_10_1630.log

rm: Removing /ais02/log/OSYSJOBS_06.OSYSPURG_01.3680460.3680464.00.2009_12_10_1630.log

+ print *** \n*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1  \n*** EXIT  WITH EXIT CODE=1  \n***

***

*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1 

*** EXIT  WITH EXIT CODE=1 

***

+ exit 1

  Child: Job return = 1

10 16:31:06-  Child: put to memory:[1]

10 16:31:06-  Child: In memory:[1]

 

The OSYSPURG jobs failed trying to perform cleanup of the /alm_orautl/b directory, which apparently no longer exists.  I’ve created a temporary version, /ais02/job/temp/sys_purg_rsh.ksh, to bypass the “b” instance orautl cleanup.  The logic in sys_purg_rsh.ksh is driven from the /orautl directory, so DBAs should remove the /orautl/b links which exist on Empire and Kebler to the non-existent /alm_orautl/b directory if the “b” instance has been deleted.  There also may be other obsolete links in the /orautl directories (BAN8@ -> /cre_orautl/BAN8, BANTRNG@ -> /ban_orautl/BANTRNG/).  I also noticed that in some cases the /orautl link points to a /***_orautl directory, but the same directories exist in other directories - example (/orautl/BANTEST@ -> /cre_orautl/BANTEST/ -- but BANTEST also is a subdirectory under the /ban_orautl directory structure).

I noticed that the logic in sys_purg_rsh.ksh is so old that we only had the orautl cleanup being performed for the “a”, “b”, and various “hr” instances.  The current /orautl cleanup criteria within sys_purg_rsh.ksh is any file older than 7 days within the /orautl/a or /orautl/hr* directories.  Should we also be cleaning up the various BAN*, ods* and kfs* instances?  As examples, /orautl/BANPROD has files dating back to 2005 and /orautl/odsprod space used is 70470341, with most of the large files dated between Jan 2009 and March 2009.

Janice.
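
As an aid for the link cleanup and the 7 day criteria discussed above, the two checks could be sketched roughly as below.  This is not the logic in sys_purg_rsh.ksh; the helper names are made up for this note, and the sketch only lists candidates without removing anything.

```sh
# Illustrative only; helper names are hypothetical and this is not the
# logic in sys_purg_rsh.ksh. Nothing is deleted; candidates are only listed.

# Print each symbolic link directly under directory $1 whose target
# no longer exists.
list_dangling_links() {
    for entry in "$1"/*; do
        # -L: entry is a symbolic link; ! -e: its target does not resolve.
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}

# Print regular files under directory $1 whose last modification is
# more than $2 days old (find -mtime +N).
list_old_files() {
    find "$1" -type f -mtime +"$2" -print
}
```

Run against /orautl, list_dangling_links would be expected to report the “b” link if /alm_orautl/b is really gone.  Reviewing the list before removing anything seems safer than letting the purge script discover a missing directory at run time.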

 

 

 

Aborted Module Name:   AREGORGN.SPOOL_TO_PRINT_02

 

  Date:        Day:      Time:          Resolution:

10/29/10     Fri        17:25             Deleted by Janice.

 

Error log and follow up comments:

 

# -> [*******************************************************************************]

# -> [FATAL EXIT CALLED FROM [spool_filter::fatal]]

# -> [-------------------------------------------------------------------------------]

# -> [ERROR: file /ais01/spool/out/AREGORGN.AREGS706_01.5280320.txt not found]

# -> [-------------------------------------------------------------------------------]

# -> [[ 2010.10.29-17:25:19 ]]

# -> [RETURN CODE = 100]

# -> [===============================================================================]

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

 

10/30/2010 10:44    JMWILKIN

I saw the emails from Joleen about the AREGORGN failure and decided to take a quick look.  The failure in the SPOOL_TO_PRINT, as well as a spawned SEND_MAIL, are due to the fact that the utl_file1 from the AREGS706 component was empty.  As followup, there should be a task to correct the AFTER conditions on the AREGS706 so spool driver entry and SEND_MAIL are not done when utl_file1 is empty.

I deleted these failed components so the AREGAW99 chain can complete.

Problem has been resolved – the AREGS706 generated no output to be printed or emailed.  I deleted the failed components so the chain could complete.

Janice.
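
The check that the corrected AFTER conditions would need to express is simply "does the output file exist and contain anything".  In shell terms that test could look like the sketch below.  The helper names are made up for this note; how the test is actually wired into the Applications Manager conditions on AREGS706 is not shown here.

```sh
# Illustrative only; the helper names are hypothetical.

# Succeed (return 0) only when the file exists and is not empty.
has_output() {
    [ -s "$1" ]
}

# Report whether the spool and mail steps should run for an output file.
spool_decision() {
    if has_output "$1"; then
        echo "spool"
    else
        echo "skip"
    fi
}
```

The -s test covers both cases seen here: a utl_file1 that was created but left empty, and a spool file that was never written at all.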

 

 

 

Aborted Module Name:   FAIDCFIM_FA.COF_RESP_01

 

  Date:        Day:      Time:          Resolution:

09/21/10     Tue       12:20          See notes below.

 

Error log and follow up comments:

 

 

Phil has just informed us that COF will not have a file available today for the FAIDCFIM_FA.COF_RESP_01 to process.  While this component would eventually abort when it doesn’t find the file,  it would be best to simply handle the situation now.

 

Please proceed with the steps outlined below, in the order specified:

1)         Kill the FAIDCFIM_FA.COF_RESP_01 component – it should end up in KILLED status

2)        Delete all the chain components which are in PRED WAIT status, except for the FAIDCFIM_FA.CHAIN_FINISH_01 component.

My preference is to display the chain in backlog via Flow Diagram,  then select all the components to be deleted (in this case, FAIDCFIM_FA.DECRYPT_01 through FAIDCFIM_FA.VPLUS_RCAP-LOOP_01), then right click and select Delete 6 (the 6 indicates  you’ve selected six components to be deleted).

3)        Verify that all chain components which were deleted are in PW-DELETE status.

4)        Delete the “KILLED” FAIDCFIM_FA.COF_RESP_01 component.

5)        Verify that the FAIDCFIM_FA.CHAIN_FINISH_01 component finishes, thereby allowing FAIDCFIM_COF_IMPORT chain to complete.

 

On a more generic note, we often prefer to allow the CHAIN_FINISH chain component to run when we are deleting a chain that has started, but due to a failure or other reasons, is not to run to completion.  One of the key reasons is that the many chain specific subvars which have been defined for the chain will be deleted via the CHAIN_FINISH component, as well as other general cleanup of work files and so on.  However, it cannot be globally said that it would always be safe to run the CHAIN_FINISH component.  Therefore, research would be necessary to determine if the CHAIN_FINISH component (or its associated BEFORE/AFTER conditions) would be taking any action(s) which should NOT be performed.  As an example, the CHAIN_FINISH component of the FAIDCFEX_COF_EXPORT chain has an AFTER condition to request in the corresponding schedule of FAIDCFIM_COF_IMPORT.  Obviously, if we are attempting to delete remaining components of a FAIDCFEX_COF_EXPORT chain, we would NOT want this condition to be performed.  In this case, if we decide to let the CHAIN_FINISH component run, while deleting the remainder of the chain components, we would first have to disable the CHAIN_FINISH conditions to prevent them from running.  CHAIN_FINISH components also may have filenames specified for the “Files to backup”, “Files to empty”, or “Files to delete” prompts which we may not wish to backup, empty or delete.  In general, research is the key to safely allowing the CHAIN_FINISH component to run when deleting the remainder of the chain components.

Janice.

 

 

 

 

 

 

 

Aborted Module Name:   OSYSJOBS_04.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

12/10/09     Thu        16:37         Restarted by Janice.

 

Error log and follow up comments:

 

 

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_04.OSYSPURG_01.3680452.3680456.01.2009_12_10_1636.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_04.OSYSPURG_01.3680452.3680456.01.2009_12_10_1636.log

+ cut -f 2 -d =

+ grep SCRIPT ABORTED

rsh_return_code=1

+ rm -ef /ais02/log/OSYSJOBS_04.OSYSPURG_01.3680452.3680456.01.2009_12_10_1636.log

rm: Removing /ais02/log/OSYSJOBS_04.OSYSPURG_01.3680452.3680456.01.2009_12_10_1636.log

+ print *** \n*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1  \n*** EXIT  WITH EXIT CODE=1  \n***

***

*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1 

*** EXIT  WITH EXIT CODE=1 

***

+ exit 1

  Child: Job return = 1

 

 

The OSYSPURG jobs failed trying to perform cleanup of the /alm_orautl/b directory, which apparently no longer exists.  I’ve created a temporary version, /ais02/job/temp/sys_purg_rsh.ksh, to bypass the “b” instance orautl cleanup.  The logic in sys_purg_rsh.ksh is driven from the /orautl directory, so DBAs should remove the /orautl/b links which exist on Empire and Kebler to the non-existent /alm_orautl/b directory if the “b” instance has been deleted.  There also may be other obsolete links in the /orautl directories (BAN8@ -> /cre_orautl/BAN8, BANTRNG@ -> /ban_orautl/BANTRNG/).  I also noticed that in some cases the /orautl link points to a /***_orautl directory, but the same directories exist in other directories - example (/orautl/BANTEST@ -> /cre_orautl/BANTEST/ -- but BANTEST also is a subdirectory under the /ban_orautl directory structure).

I noticed that the logic in sys_purg_rsh.ksh is so old that we only had the orautl cleanup being performed for the “a”, “b”, and various “hr” instances.  The current /orautl cleanup criteria within sys_purg_rsh.ksh is any file older than 7 days within the /orautl/a or /orautl/hr* directories.  Should we also be cleaning up the various BAN*, ods* and kfs* instances?  As examples, /orautl/BANPROD has files dating back to 2005 and /orautl/odsprod space used is 70470341, with most of the large files dated between Jan 2009 and March 2009.

Janice.

 

 

 

Aborted Module Name:   FAIDCFAT_FA.GLBDATA-LOOP_01

 

  Date:        Day:      Time:          Resolution:

09/07/10     Tue         06:00           See note from Janice below.

Error log and follow up comments:

 

 

I’m including the DBA’s on this email, as it appears with the many Appworx failures we have a problem with the databases (all appropriate instances are in restricted mode).

The error in FAIDCFAT_FA.GLBDATA-LOOP_01 was:

+ print Failure in spawned GLBDATA - abort this module

Failure in spawned GLBDATA - abort this module

+ exit 1

+ err=1

Consequently, the output log from the spawned GLBDATA must be viewed to determine what caused the failure in FAIDCFAT_FA.GLBDATA_01:

ERROR:

ORA-12526: TNS:listener: all appropriate instances are in restricted mode

Oh.. wait, it was giving us LAUNCH ERRORS so I thought it might be AWPROD… but it was still related to BANPROD because the module attempting to launch uses Appworx subvars which query BANPROD to obtain the value for the subvar.

I’ll try it again after I update the so_job_queue table because it has been retried too many times and the Operator Log for that job has filled up.  As a reminder, IT Scheduling should further investigate the problem and/or solicit help if a LAUNCH ERROR status chain component repeatedly goes into LAUNCH ERROR status upon retry.  An easy way to investigate is to view the Operator log for the component – in this case, it revealed the following error:

2010-09-07 07:13:06 status action QUEUED by JWEARNE

RmiServer 09-07-2010 07:13:07 MDT

Job launch error: 5013708.06 agent: AWPROD host: kebler.is.colostate.edu

ORA-12526: TNS:listener: all appropriate instances are in restricted mode

 

In this case, an ORA message was displayed each time the component was resubmitted.

Janice.

 

 

Aborted Module Name:   AGENDYGN.AGENS006_01

 

  Date:        Day:      Time:          Resolution:

07/10/09     Fri         20:00           See Janice’s comments below.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-29282: invalid file ID

ORA-06512: at "SYS.UTL_FILE", line 802

ORA-06512: at line 671

ORA-06512: at line 1511

ORA-29280: invalid directory path

 

I noticed the AGENDYGN.AGENS006_01 failure and tried a resubmit, but it failed again.  Weird situation -- looks like it doesn't like the /orautl/BANPROD directory.. Oh wait.. sql changed today although last modlog entry is 6/12/09 and the utl path is hard-coded in the sql as /orautl/BANTEST while the line to use &&utl_path has been commented out??? I could fix that, but maybe the version of sql in prod isn't what should be there -- i.e. why was it changed today and no recent modlog entry is present?  Why is the version in prod using /orautl/BANTEST hardcoded logic?  Sounds like a test version of the sql got placed into production, so I'll let others followup in the A.M.

Janice.

 

 

Aborted Module Name:   AROSDGL1_AROSS167_01

 

  Date:        Day:      Time:          Resolution:

12/16/09     Wed        09:33          Restarted by David.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-20100: ::ID does not exist.::

ORA-06512: at "BANINST1.GB_COMMON", line 451

ORA-06512: at line 367

 

09:32:03 367      fetch trx_cur into trx_rec;

I am assuming that something changed for a person between the load of GURFEED and the run of AROSS167.

I am looking into the bad record now. I will respond with a solution shortly.  Josh

 

There are 80 total transactions with an invalid GURFEED_ID.  Below is the breakdown.

       GURFEED_ID     Transactions

1      null           62

2      824109854      18

I will follow up with AR to see why these do not have valid ID’s. Josh

 

The Null values are not causing any problems.

The 824109854 has been modified.  Vicki is working to determine what was going on with that ID.  Once a decision has been made on what to do with it we can restart the module.

Josh.

 

Module was restarted and is now complete.

Jan.

 

 

 

 

Aborted Module Name:   HRMSQPD.CHAIN_SUMMARY_01

 

  Date:        Day:      Time:          Resolution:

07/07/09     Tue        08:40           No follow up received.  

 

Error log and follow up comments:

 

 

 

Error:-999 -ERR 171 Database error on 'AW_REQUEST' - ORA-20006: User "APPWORX" is not active - contact I.T. Scheduling at 491-1375

ORA-06512: at "APPWORX.AW5", line 110

ORA-06512: at "APPWORX.AW5", line 54

ORA-06512: at line 1

 - begin    if lower(:so_user_name

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

The Appworx userid was deactivated by a failed mkbanner.  Mark will research why this happened – might be an issue with the banprod_general link.  The mkbanner works fine on AWTEST, using the bantest_general link.  I reset the appworx userid within Appworx to “active” and restarted the following jobs which had failed with this error:

Janice.

 

 

 

 

 

 

Aborted Module Name:   HRMSDAY1.HRMSS009_01

 

  Date:        Day:      Time:          Resolution:

12/16/09     Wed      06:32            See note from Janice below.

Error log and follow up comments:

           

 

Both HRMSS007 and HRMSS009 failed with:

ERROR at line 1:

ORA-12541: TNS:no listener

which I believe was caused by problems with ODS or this link to ODS.   Both of these sqls use the csug_gp_demo_v view, which selects data from csuban.csug_gp_demo@odsprod.world

When I attempt via an sql (on hrprod) to just select count using this link to odsprod, the same TNS no listener error is produced:

07:20:07 SQL> select count(*) from csuban.csug_gp_demo@odsprod.world

07:22:49   2  /

select count(*) from csuban.csug_gp_demo@odsprod.world

                                         *

ERROR at line 1:

ORA-12541: TNS:no listener

Janice.

 

 

 

 

 

Aborted Module Name:   KFSXAPIM.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

09/24/09     Fri          07:04           See response from Janice.

  

Error log and follow up comments:

 

The error is :

ERROR: Exception caught:

Caused by: org.springframework.beans.factory.CannotLoadBeanClassException: Error loading class [edu.csu.kfs.fp.batch.service.impl.ProcurementCardCreateDocumentServiceImpl] for bean with name 'procurementCardCreateDocumentService' defined in class path resource [edu/csu/kfs/fp/spring-fp.xml]: problem with class file or dependent class; nested exception is java.lang.NoClassDefFoundError: groovy.lang.GroovyObject

Caused by: java.lang.NoClassDefFoundError: groovy.lang.GroovyObject

Same error occurred in KFSXAPIM.KFSX_JAVA_01. 

Last week we encountered this error in kfsdevl and it required a change to the kfsdevl_Applications Manager.env file.  I’m wondering if the build last night to kfsprd now requires the same change to kfsprd.env file?  Email traffic related to kfsdevl.env shown below:

Shawn.

 

Can you add $LIB_KFS/groovy-all-1.6.4.jar:\  to kfsdevl_Applications Manager.env?

Kevin.

 

Can a DBA please follow-up on this?
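
Before restarting the failed KFSX_JAVA components, it may help to confirm that the classpath built by the env file really includes the groovy jar.  The sketch below is only a suggestion: the helper name is made up for this note, and it assumes the env file ends up exporting an ordinary colon-separated classpath string, which has not been verified for kfsprd.

```sh
# Illustrative only; the helper name is hypothetical, and a colon-separated
# classpath string is assumed.

# Succeed (return 0) when the classpath in $1 contains an entry whose
# file name is exactly $2.
classpath_has_jar() {
    case ":$1:" in
        *"/$2:"* | *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, classpath_has_jar "$CLASSPATH" groovy-all-1.6.4.jar would succeed only after the requested line has been added and the env file re-sourced, assuming CLASSPATH is the variable the env file sets.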

By the way, encumbrances did NOT feed to KFS last night due to this problem; that’s one of the jobs that failed.  Since HR has requested that no encumbrances run starting tonight through Sept 30, it is critical that we successfully post last night’s encumbrances to KFS by finishing out this job.  However, the job is designed to shut down Tomcat, run scrubber/poster, and then start Tomcat back up.  This will impact KFS users!!

Since the post of encumbrances did not happen last night, the WHRS_CUR_FY_JOBACCT_MTH_00 table refresh was skipped.  However, since we need to post encumbrances to KFS and it’s running now -- I requested the WHRSL023 chain back into backlog so we can force a refresh of the WHRS_CUR_FY_JOBACCT_MTH_00 table.  Otherwise, we would be out of sync – plus WHRS_CUR_FY_JOBACCT_MTH_00 table won’t be refreshed for the rest of the month due to HR suspending HRMS encumbrance processing through Sept 30.

Janice.

 

 

 

Aborted Module Name:   KFSXCS20.HRMSS174_01

 

  Date:        Day:      Time:          Resolution:

07/21/09     Tue       21:11            See Janice’s note below.

 

Error log and follow up comments:

 

 

ERROR at line 1:

ORA-01410: invalid ROWID

ORA-06512: at line 228

 

21:11:35 228  for C1 in get_encumbrance_amounts (l_start_date, l_end_date) loop

 

This will be a good test of the feature for the later GL update chain (KFSXGL_D2) to proceed without encumbrances - it will be released by KFSXAW11 at 1 A.M. -- although I may just release it now since there isn't anything to wait for (KFSXCS20 already failed).

Janice.

 

 

 

 

Aborted Module Name:   FAIDTMIM.TDCLIENT_01

 

  Date:        Day:      Time:          Resolution:

04/30/10     Fri         08:33           Restarted by Janice.

 

Error log and follow up comments:

 

I could not find lines 54 or 110.

 

Error:-999 -ERR 171 Database error on 'AW_REQUEST' - ORA-20006: User "APPWORX" is not active - contact I.T. Scheduling at 491-1375

ORA-06512: at "APPWORX.AW5", line 110

ORA-06512: at "APPWORX.AW5", line 54

ORA-06512: at line 1

 - begin    if lower(:so_user_name

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

The Appworx userid was deactivated by a failed mkbanner.  Mark will research why this happened – might be an issue with the banprod_general link.  The mkbanner works fine on AWTEST, using the bantest_general link.  I reset the appworx userid within Appworx to “active” and restarted the following jobs which had failed with this error:

Janice.

 

 

 

Aborted Module Name:   AROSDGLI.SEND_MAIL_01

 

  Date:        Day:      Time:          Resolution:

08/13/09     Thu         19:12          See note from Janice below.

 

Error log and follow up comments:

 

 

 

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cp: /orautl/BANPROD/AROSDGLI.AROSS162_01.utl_file1: A file or directory in the path name does not exist.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

 

Josh will manually create this missing file (and associated .recon file)  and we will manually copy these files to the appropriate filenames in /ais01/dat/aros/prod directory so tonight’s KFSX Enterprise Feed will pick them up.

I’ve restarted the failed component to allow the AROS schedule to proceed.

Janice.

 

 

 

 

 

Aborted Module Name:   KFSXFPPD.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

08/13/09     Thu       07:04          See Janice & John Hunter’s note below.

 

 

Error log and follow up comments:

 

KFSXFPPD Failure - Error:

2009-08-13 07:08:25,393 [main] INFO  org.kuali.rice.kew.docsearch.SearchableAttribute :: ...finished indexing document 359577 for document search.

2009-08-13 07:08:26,334 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPD.disbursementVoucherPreDisbursementProcessorExtractStep.3178509.3178550.00 steps: [disbursementVoucherPreDisbursementProcessorExtractStep]

2009-08-13 07:08:26,335 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.IllegalArgumentException: Unable to find customer profile for M/CSU/DV

RunBatch ERROR: Exception found:

java.lang.IllegalArgumentException: Unable to find customer profile for M/CSU/DV

 

Just as with the problem yesterday with the chain component failure within KFSXPDSA, this KFSXFPPD failure has halted the daily KFSXPD_DY_PDP_DAILY_CHECK_ACH Applications Manager chain.  The PDP Daily Check/ACH Processing for 12-AUG-2009 Summary email will not be sent, nor will any of today’s ACH or check processing be performed, until the problem with KFSXFPPD is resolved.

Janice.

 

For the Library feed, we had M in this table; I’ve changed it to MC.

 

 

John Hunter.

 

 

Aborted Module Name:   KFSXCS52.KFSXS007_01

 

  Date:        Day:      Time:          Resolution:

12/16/09     Wed       06:32          See note from Janice below.

 

Error log and follow up comments:

 

 

One more production job which failed trying to connect to odsprod:

ERROR at line 1:

ORA-12541: TNS:no listener

ORA-06512: at line 250

06:32:30 250              select rtrim(FIRST_NAME), rtrim(LAST_NAME) , rtrim(MIDDLE_NAME), rtrim(SUFFIX_NAME)

06:32:30 251                         ,WORK_PHONE, rtrim(EMAIL), DEPARTMENT_NUMBER, rtrim(ENAME), t1.csu_id

06:32:30 252                         ,HR_EMPLOYEE_TYPE, WEID_EMPLOYEE_TYPE

06:32:30 253              INTO ODS_FIRST_NAME, ODS_LAST_NAME, ODS_MIDDLE_NAME, ODS_SUFFIX_NAME, ODS_WORK_PHONE

06:32:30 254                          ,ODS_EMAIL, ODS_DEPARTMENT, ODS_ENAME, ODS_CSU_ID

06:32:30 255                          ,ODS_HR_EMPLOYEE_TYPE, ODS_WEID_EMPLOYEE_TYPE

06:32:30 256           from csuf_employee_primary               t1

06:32:30 257           where t1.employee_number = X.KFS_PRNCPL_ID

06:32:30 258             and rownum = 1;

 

csuf_employee_primary is a view on kfsprd, selecting data from csuf_employee_primary@kfs_to_ods.

Janice.

 

 

 

Aborted Module Name:   ODSRAGEN.ODSRS001_02

 

  Date:        Day:      Time:          Resolution:

08/24/09     Mon        09:20          See Jan & Janice’s note below.

 

Error log and follow up comments:

 

+ /appworx/exec/FILESIZE ODSRAGEN.ODSRS001_02.3221972.3221976.00.2009_08_24_0313.jobout 100

no output from ODSRAGEN.ODSRS001_02

+ err=100

+ date

+ echo exiting  SQLP_CSU Mon Aug 24 03:13:53 MDT 2009

exiting  SQLP_CSU Mon Aug 24 03:13:53 MDT 2009

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

 

ODSRAGEN.ODSRS001_02 is complete.

Jan.

 

Due to the problem with the Oracle recycle process Sunday night, all of the ODS Refresh Applications Manager Chains had components which failed with:

Enter user-name: ERROR:                                                         

ORA-12541: TNS:no listener   

 

These ODS refresh chains are dependent on the Applications Manager recycle chain (ODBACYCP_RECYCLE_PROD_SYSTEMS) which contains the ODBA_RECYCL_PRD chain component to execute the /app/oracle/admin/dba/mgr/oracle_system_recycle script.  Even though the recycle did not successfully complete last night, /app/oracle/admin/dba/mgr/oracle_system_recycle apparently returned a zero return code -- thereby allowing dependent ODS Refresh jobs to proceed and subsequently fail due to the TNS no listener problem.

 

Would it be possible for the /app/oracle/admin/dba/mgr/oracle_system_recycle script to return a non-zero return code when such problems occur?  If the error could be trapped, then dependent ODS Refresh Chains would not run until the problem was resolved and appropriate DBA, plus Applications Manager followup, had been done.  When manual activity is taken to resolve the oracle recycle problem, the failed ODBA_RECYCL_PRD chain component of the ODBACYCP_RECYCLE_PROD_SYSTEMS chain would also need to manually be deleted, which would then allow dependent ODS Refresh chains to proceed. 

 

The dependency connection between the "Oracle System Recycle" chain and ODS Refresh Chains will only be meaningful and effective if failure(s) in the "Oracle System Recycle" script are detected and reported back to Applications Manager.

Janice.

 

 

 

Aborted Module Name:   ODSRAGEN.ODSRS002_01

 

  Date:        Day:      Time:          Resolution:

08/24/09     Mon        09:20          See Janice’s note below.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-12008: error in materialized view refresh path

ORA-01555: snapshot too old: rollback segment number  with name "" too small

ORA-02063: preceding line from HR_ODS_USER

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2251

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2457

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2426

ORA-06512: at line 48

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

07:19:00  48        dbms_mview.refresh('CSUBAN.SPRIDEN_EMPN_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

 

Due to the problem with the Oracle recycle process Sunday night, all of the ODS Refresh Applications Manager Chains had components which failed with:

                                                                            

Enter user-name: ERROR:                                                        

ORA-12541: TNS:no listener   

 

These ODS refresh chains are dependent on the Applications Manager recycle chain (ODBACYCP_RECYCLE_PROD_SYSTEMS), which contains the ODBA_RECYCL_PRD chain component that executes the /app/oracle/admin/dba/mgr/oracle_system_recycle script.  Even though the recycle did not complete successfully last night, /app/oracle/admin/dba/mgr/oracle_system_recycle apparently returned a zero return code -- thereby allowing dependent ODS Refresh jobs to proceed and subsequently fail due to the TNS no listener problem.

 

Would it be possible for the /app/oracle/admin/dba/mgr/oracle_system_recycle script to return a non-zero return code when such problems occur?  If the error could be trapped, then dependent ODS Refresh Chains would not run until the problem was resolved and the appropriate DBA and Applications Manager follow-up had been done.  When manual action is taken to resolve the Oracle recycle problem, the failed ODBA_RECYCL_PRD chain component of the ODBACYCP_RECYCLE_PROD_SYSTEMS chain would also need to be deleted manually, which would then allow dependent ODS Refresh chains to proceed.

 

The dependency connection between the "Oracle System Recycle" chain and ODS Refresh Chains will only be meaningful and effective if failure(s) in the "Oracle System Recycle" script are detected and reported back to Applications Manager.
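
As an illustration of the request above, here is a minimal sketch of how a recycle-style script could pass a failure back to its caller.  The contents of oracle_system_recycle are not shown in this log, so run_step and recycle_all are hypothetical helper names and the step commands are placeholders supplied by the caller.

```shell
#!/bin/sh
# Sketch only: run_step and recycle_all are hypothetical helpers, and the
# step commands are placeholders, not the real oracle_system_recycle steps.

# Run one step; report a failure and hand back the step's exit status.
run_step() {
    step_name=$1
    shift
    "$@"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "*** ERROR: recycle step '$step_name' returned $rc" >&2
    fi
    return "$rc"
}

# Run each step in order and stop at the first failure, so that the
# calling chain component sees a non-zero return code.
recycle_all() {
    for step in "$@"; do
        run_step "$step" sh -c "$step" || return 1
    done
    return 0
}
```

A script built this way would end with something like `recycle_all "step one" "step two" || exit 1`, so a failed step leaves the chain component in a failed status instead of releasing the dependent ODS Refresh chains.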

Janice.

 

 

 

 

Aborted Module Name:   ODSRAGEN.ODSRS002_01

 

  Date:        Day:      Time:          Resolution:

09/08/09     Tue       01:41          See note from Janice below.

01/22/14     Wed      02:04          Deleted by Dermot.

 

Error log and follow up comments:

09/08/09.

ORA-12008: error in materialized view refresh path

ORA-04052: error occurred when looking up remote object

ODS_USER.CSUH_AGEN_CSUID_V@HR_ODS_USER

ORA-00604: error occurred at recursive SQL level 2

ORA-12541: TNS:no listener

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2251

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2457

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2426

ORA-06512: at line 48

 

By the way, if AGEN refreshes are going against the HRMS database, there should be a discussion about whether dependencies need to be added to ensure that HRMS database updates have completed before the AGEN refreshes occur.    All such cross system ODS refreshes should be identified and appropriate analysis regarding dependency connections should be performed. 

It may even be desirable to separate out such cross system ODS refreshes into separate Applications Manager refresh chain(s) – for example if the AGEN refresh which utilizes HR were in a separate chain, then a failure due to HRMS being down (such as occurred last night) would not cause all the rest of the AGEN refreshes to be halted.  Likewise, if the AGEN refresh which utilizes HR were placed into the HRMS refresh chain, then BANPROD being down could potentially cause all the rest of  the HRMS refreshes to be halted.  However, if such cross system refreshes were isolated into separate ODS refresh chain(s), then the impact of database(s) being down would be isolated only to the refreshes for the “down” database and those isolated cross system refreshes using that “down” database.

Janice.

 

01/22/14.

ORA-12008: error in materialized view refresh path

ORA-01555: snapshot too old: rollback segment number 16 with name "_SYSSMU16_1294186362$" too small

ORA-02063: preceding line from BANPROD@ODS_USER

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2563

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2776

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2745

ORA-06512: at line 87

I'm not sure where to go with ORA-12008: error in materialized view refresh path

ORA-01555: snapshot too old: rollback segment number 16 with name "_SYSSMU16_1294186362$" too small

Should we just restart ODSRS002?  Or do we need some DBA magic?

Vicki.

 

In the past we have found that restarting puts us back at the beginning of the process, which is unnecessary and can set the schedule back HOURS...

Steve. G.

 

The ODSRAGEN.ODSRS002_01 was on the last MV refresh --  I will run it and let you know when it is done.

The CSUBAN.CSUS_TEST_SCORES_MV failed again with the same error.

I think with Banner database being so busy with registration it just won't complete.

Do you want to abort it for today and try again tomorrow?

Let me take a look at the undo advisor and see if I can make some changes.  This is caused because there are many updates going on at the same time something that runs a long time is querying the updated tables.  I'd advise against running this during the day if it fails overnight.

Mark. P.

 

This might be relevant - Admissions is loading test scores into Banner manually (via the API) every morning, for the most part.  I confirmed that they loaded 654 rows into sortest this morning.

Kathy B.

 

 

 

Aborted Module Name:   HRMSCPR_SAL_HRMSS063_01

 

  Date:        Day:      Time:          Resolution:

09/22/09     Tue       12:29            See note below.

Unable to open the error module.  To view the log, go to the master module, which should be in INITIATED status.

Double-click it to get the screen displayed below; from there you can view the output file to find the error.

 

Error log and follow up comments:

I could not open it to locate the error.   Dawn.

The error can be found in the concurrent manager out file (o3877725.out), which can be viewed via Explorer window,  Output Files tab for this failed chain component.

Error message from out file:

ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

ORA-06512: at line 1062.    Janice.

What is a concurrent manager? Dawn.

The HRMS programs which run via Applications Manager OAE (Oracle Apps Extension) are HRMS Concurrent Manager programs – i.e. programs defined to the HR Application Concurrent Manager Feature.  While these concurrent manager programs can be scheduled within the HRMS application itself, we have chosen to instead schedule them via Applications Manager using the Applications Manager add-on OAE product.  Basically, when one of these OAE type programs is a chain component, then Applications Manager will interface to the HRMS Concurrent Manager to submit the concurrent manager program to run.  Likewise, upon completion (successful or not) of the HRMS Concurrent Manager program, the Applications Manager OAE interface retrieves output listings from the HR Application and presents them for viewing via the Applications Manager Explorer Window (Output Files Tab).  This is similar to the interface with Banner programs in that the Banner log and lis files are available for viewing via Applications Manager Explorer Window (Output Files Tab).

Janice.

 

 

Aborted Chain Name: KFSXFPPC.SEND_MAIL_01

 

  Date:        Day:      Time:          Resolution:

06/15/11     Wed      15:05            Restarted by Steve Greene.

 

Error log and follow up comments:

 

You may have seen this abort (KFSXFPPC.SEND_MAIL_01) a while ago.  I noticed that the error was the same as one from a week ago on another chain:

 

SMTP Failed to connect to mail server: A system call received a parameter that is not valid.

at /appworx/csu/exec/SENDMAIL.PL line 792

error is 255

 

I still had an email from Janice about the previous abort, so I asked her if this was the same scenario.  She said yes, but the conditions were simpler, and there was no need to redo or delete the conditions.  She said the module could just be restarted, which I did.  It finished successfully.
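
Since a plain restart was enough here, a transient failure like this one could in principle be retried automatically.  The sketch below is only an illustration: retry is a hypothetical helper, not part of SENDMAIL.PL or Applications Manager, and it makes sense only for steps that are safe to run more than once.

```shell
#!/bin/sh
# Sketch only: retry is a hypothetical helper name.
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed, and returns
# the exit status of the last run.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while :; do
        "$@"
        rc=$?
        [ "$rc" -eq 0 ] && return 0
        [ "$i" -ge "$attempts" ] && return "$rc"
        i=$((i + 1))
        sleep "$delay"
    done
}
```

For example, `retry 3 60 some_mail_command` would try up to three times, one minute apart, where some_mail_command stands in for whatever actually sends the mail.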

Stevie G.

 

 

 

 

 

 

Aborted Chain Name: WHRSL022.SQLLOAD-LOOP_01

 

  Date:        Day:      Time:          Resolution:

09/23/09     Wed      06:14            See note from Janice & Diane.

11/29/12     Thu       06:14            See note from Gudrun & Mark.

 

Error log and follow up comments:

 

 

09/23/09.   

module is in LOADFAIL status:

 

The load for WHRS_CUR_FY_EXPHIST_00 on ODSPROD is failing due to a missing column in the table definition.  The table definition will need to be fixed by a DBA before the failed production chain component can be restarted.

Janice.

 

SQL*Loader-466: Column SUBACCT does not exist in table "CSUHR"."WHRS_CUR_FY_EXPHIST_00".

Janice.

 

I think I got this incident late yesterday. 

Diane.

 

11/29/12.

+ [ -f /app/oracle/product/11.2.0.3/bin/sqlldr ]

+ echo Could not find Sql Loader executable

Could not find Sql Loader executable

+ exit 2

 

ORACLE_HOME for odsprod points to /app/oracle/product/11.2.0.3/ in /etc/oratab on kebler. YET this Oracle Home DOES not have a sqlldr.

 

Please change the ORACLE_HOME in /etc/oratab to 11.1.0.7. It does have a sqlldr.

 

Otherwise, if that is not possible, we will have to conditionally set ORACLE_HOME, PATH and LIBPATH in our PREFIX.@odsprod script.

 

/etc/oratab

odsprod:/app/oracle/product/11.2.0.3:N

odstest:/app/oracle/product/11.2.0.3:N

odsdevl:/app/oracle/product/11.2.0.3:N
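
The conditional setting mentioned above could look roughly like the sketch below.  It is not the PREFIX.@odsprod script: oratab_home and pick_sqlldr_home are hypothetical helper names, and the fallback home is whatever the caller passes in.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# oratab_home SID [ORATAB]: print the ORACLE_HOME recorded for SID.
oratab_home() {
    sid=$1
    oratab=${2:-/etc/oratab}
    awk -F: -v sid="$sid" '$0 !~ /^#/ && $1 == sid { print $2; exit }' "$oratab"
}

# pick_sqlldr_home SID FALLBACK_HOME [ORATAB]: print the oratab home if it
# contains an executable bin/sqlldr, otherwise FALLBACK_HOME if that one does.
pick_sqlldr_home() {
    home=$(oratab_home "$1" "$3")
    if [ -n "$home" ] && [ -x "$home/bin/sqlldr" ]; then
        echo "$home"
    elif [ -x "$2/bin/sqlldr" ]; then
        echo "$2"
    else
        echo "Could not find Sql Loader executable" >&2
        return 2
    fi
}
```

A caller would then set ORACLE_HOME from the printed value and rebuild PATH and LIBPATH from it before invoking sqlldr.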

 

Once ORACLE_HOME issue is resolved APMX team can recover job.

Gudrun.

 

This is why I didn’t want to upgrade the client on Kebler.  I will put it back the way it was.

I put the Oracle client back to 10.2.0.3 like it was prior to the install of the 11.2.0.3 client on Kebler.  I also added the Oracle utilities to the 11.2.0.3 installation, which is the correct solution to the problem.  Would you like to use the old client or the new client for ODS?

Mark.

 

 

 

 

 

 

 

 

Aborted Module Name:   KFSXPDSA.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

09/23/09     Wed      08:14            See note from Janice & Kevin.

 

Error log and follow up comments:

 

KFSXPDSA.KFSX_JAVA_01 (pdpSendAchAdviceNotificationsStep) failed. 

 As a reminder, the daily PDP check cycle will not proceed until this failed chain component is either deleted or successfully completes.

 

  nested exception is:

                com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <www.medsourceinc.us.com>... User unknown

                at org.springframework.mail.javamail.JavaMailSenderImpl.doSend(JavaMailSenderImpl.java:407)

                at org.springframework.mail.javamail.JavaMailSenderImpl.send(JavaMailSenderImpl.java:298)

                at org.springframework.mail.javamail.JavaMailSenderImpl.send(JavaMailSenderImpl.java:284)

                at org.kuali.rice.kns.service.impl.MailServiceImpl.sendMessage(MailServiceImpl.java:60)

                ... 6 more

2009-09-23 08:13:10,878 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXPDSA.pdpSendAchAdviceNotificationsStep.3350895.3350924.00 steps: [pdpSendAchAdviceNotificationsStep]

2009-09-23 08:13:10,878 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

                at org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl.sendAchAdviceEmail(PdpEmailServiceImpl.java:560)

                at org.kuali.kfs.pdp.batch.service.impl.AchAdviceNotificationServiceImpl.sendAdviceNotifications(AchAdviceNotificationServiceImpl.java:56)

                at org.kuali.kfs.pdp.batch.SendAchAdviceNotificationsStep.execute(SendAchAdviceNotificationsStep.java:38)

                at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

                at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

                at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

Janice.

 

The offending email address has been set to BFS_AcctPay@mail.colostate.edu.  The job can be rerun.

Kevin.
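
The rejected recipient in the trace above (www.medsourceinc.us.com) has no "@" in it, so a simple shape check on the address would have flagged it before the batch step ran.  The sketch below is only an illustration: looks_like_email is a hypothetical helper, not part of KFS, and a syntax check like this cannot catch a well-formed address that the mail server rejects as "User unknown".

```shell
#!/bin/sh
# Sketch only: looks_like_email is a hypothetical helper name.
# Returns 0 when the argument has the rough shape local@domain.tld.
looks_like_email() {
    case $1 in
        *" "*)    return 1 ;;   # no embedded spaces
        *@*@*)    return 1 ;;   # at most one "@"
        ?*@?*.?*) return 0 ;;   # something@something.something
        *)        return 1 ;;
    esac
}
```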

 

 

Aborted Module Name:   KFSXGLEF_D2.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

09/24/09     Thu        22:01            See note below from Janice & Kevin.

 

Error log and follow up comments:

 

ERROR: Exception caught:

Caused by: org.springframework.beans.factory.CannotLoadBeanClassException: Error loading class [edu.csu.kfs.fp.batch.service.impl.ProcurementCardCreateDocumentServiceImpl] for bean with name 'procurementCardCreateDocumentService' defined in class path resource [edu/csu/kfs/fp/spring-fp.xml]: problem with class file or dependent class; nested exception is java.lang.NoClassDefFoundError: groovy.lang.GroovyObject

Caused by: java.lang.NoClassDefFoundError: groovy.lang.GroovyObject

Same error occurred in KFSXAPIM.KFSX_JAVA_01. 

Last week we encountered this error in kfsdevl and it required a change to the kfsdevl_Applications Manager.env file.  I’m wondering if the build last night to kfsprd now requires the same change to kfsprd.env file?  Email traffic related to kfsdevl.env shown below:

Shawn,

Can you add $LIB_KFS/groovy-all-1.6.4.jar:\  to kfsdevl_Applications Manager.env

Kevin.
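
A NoClassDefFoundError like this one usually means a jar is missing from the classpath.  As a rough illustration, the sketch below lists classpath entries that do not exist on disk.  missing_classpath_entries is a hypothetical helper, not part of the KFS batch scripts, and it checks only for missing files; an existing jar that was simply never added to the env file would need a different check.

```shell
#!/bin/sh
# Sketch only: missing_classpath_entries is a hypothetical helper name.
# Prints every entry of a colon-separated classpath that does not exist.
missing_classpath_entries() {
    old_ifs=$IFS
    IFS=:
    set -f      # keep wildcard entries such as lib/* literal (reported as missing)
    for entry in $1; do
        if [ -n "$entry" ] && [ ! -e "$entry" ]; then
            echo "$entry"
        fi
    done
    set +f
    IFS=$old_ifs
}
```

Running it against the classpath built from an env file after a build would show at a glance whether an entry such as $LIB_KFS/groovy-all-1.6.4.jar points at a file that is not there.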

 

Can a DBA please follow-up on this?

By the way, encumbrances did NOT feed to KFS last night due to this problem – that’s one of the jobs that failed.  Since HR has requested that no encumbrances run starting tonight through Sept 30, it is critical that we successfully post last night’s encumbrances to KFS by finishing out this job.  However, the design of the job is to shut down Tomcat – run scrubber/poster – then start Tomcat back up.  This will impact KFS users!!

 

Since the post of encumbrances did not happen last night, the WHRS_CUR_FY_JOBACCT_MTH_00 table refresh was skipped.  However, since we need to post encumbrances to KFS and it’s running now -- I requested the WHRSL023 chain back into backlog so we can force a refresh of the WHRS_CUR_FY_JOBACCT_MTH_00 table.  Otherwise, we would be out of sync – plus WHRS_CUR_FY_JOBACCT_MTH_00 table won’t be refreshed for the rest of the month due to HR suspending HRMS encumbrance processing through Sept 30.

Janice.

 

 

 

 

Aborted Module Name:   HRMSCHK_QPH.CHECK_WRITER_02

 

  Date:        Day:      Time:          Resolution:

09/25/09     Fri         10:30            See note from Janice.

 

Error log and follow up comments:

 

The following module is in DB ERROR status:

 

HRMSCHK_QPH.CHECK_WRITER_02

 

Occasionally, we encounter a problem where Applications Manager thinks that there is not enough room in the SO_LOG column of the SO_JOB_QUEUE table to store the information Applications Manager is logging related to condition actions.  Usually the RmiServer log provides a hint that this is the problem with an error like this:

ORA-12899: value too large for column – and references the SO_LOG column.

This time the error message said the problem was with the SO_REQUEST_DATE column which doesn’t really make any sense at all.  At any rate, the result is still the same – the module goes into DBERROR status while trying to perform the BEFORE conditions associated with the module.    In the particular case of the CHECK_WRITER component on which today’s DBERROR occurred, it is even more confusing because the previous execution of this same module with exactly the same conditions just moments before finished just fine!

To fix this problem, replace the contents of the SO_LOG column in the SO_JOB_QUEUE table for this specific jobid with something small (or just a null value). 

For example:

# export ORACLE_SID=awprod

# export TWO_TASK=awprod

# sqlplus Applications Manager

SQL*Plus: Release 10.2.0.3.0 - Production on Fri Sep 25 10:40:35 2009

Enter password:

Connected to:

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

10:40:39 SQL> update so_job_queue

10:40:51   2  set so_log = 'DB ERROR(QUEUED) 2009-09-25 10:15:26'

10:43:09   3  where so_jobid= '3361819'

10:43:30   4  /

1 row updated.

10:43:36 SQL> commit

10:43:43   2  /

Commit complete.

 

If you don’t have the Applications Manager password, Greg or Rich could perform this sql update.

Then the job can be restarted.  Of course, each situation is different and the associated CONDITIONS would need to be carefully examined to determine how/if it can be restarted and whether any conditions have already been tagged as DONE that would need to be reactivated, etc.

Let me know if you have questions – I don’t want to be the only one who knows how to fix this!!

 

Another method to help identify this problem is to view the Operator Log (via Applications Manager Explorer, Operator Log Tab) for the chain component with the DBERROR.  If the log display is very long and appears to possibly be truncated at the end – then the “value too large for column” may have occurred for the SO_LOG column.  By the way, I would report this problem to UC4 but they would probably tell us to upgrade :) and/or not be able to reproduce the problem.  It doesn’t happen very often – but in the past, it’s basically been impossible to get the component restarted without manually updating the SO_LOG column to allow room for Applications Manager to log activity associated with that chain component.

Janice.

 

 

 

 

 

 

 

Aborted Module Name:   ADMSPROS.SRTLOAD_01

 

  Date:        Day:      Time:          Resolution:

09/29/09     Tue        10:30         See note from Janice below.

 

Error log and follow up comments:

 

this_report_title=.Electronic_Prospect_Load

+ eprint_report=ADMSPROS.SRTLOAD_01.Electronic_Prospect_Load

+ [[ -s /appworx/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis ]]

+ [[ ! -s /ais01/ftp/to/eprint/ADMSPROS_tempdir/ADMSPROS.SRTLOAD_01.lis ]]

+ print -n -- \n\fREPORT     :  ADMSPROS.SRTLOAD_01.Electronic_Prospect_Load\n\f

+ 1>> /ais01/ftp/to/eprint/ADMSPROS_tempdir/ADMSPROS.SRTLOAD_01.lis

+ cat /appworx/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis

+ 1>> /ais01/ftp/to/eprint/ADMSPROS_tempdir/ADMSPROS.SRTLOAD_01.lis

+ [ 139 -eq 0 ]

 

For future reference, it would be helpful if IT Scheduling can include the type of feedback which I’ve sent this morning for the other SRTLOAD aborts and include the functional analyst as an email recipient.  Note the correct excerpts from the various logs shown below for this latest ADMSPROS.SRTLOAD abort.

From  the Banner log (ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.log):

file name /ais01/dat/work/prod/d3370022

Missing DOB field for record with name of

 Rozana Beluts and SSN of 

Missing DOB field for record with name of

 Grant Hinkle and SSN of 

Missing DOB field for record with name of

 Kevin Nguyen and SSN of 

Missing DOB field for record with name of

 Ashley Sharpe and SSN of 

Missing DOB field for record with name of

 Bobby Torandaz and SSN of 

Missing DOB field for record with name of

 Michael Venter and SSN of 

Missing DOB field for record with name of

 Hamel Winter and SSN of 

srtload completed successfully

609 lines written to /appworx/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis

TOTAL EXECUTION TIME IN SECONDS: 22

TOTAL EXECUTION TIME IN MINUTES: 0.367

From the Banner lis (ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis)

Number of Records Read from Tape: 216

Total of Prospects Loaded : 216

Total of PIDMs Matched : 64

Total of Conversion Errors : 7

From the joblog (ADMSPROS.SRTLOAD_01.3370022.3370029.00.2009_09_29_1029.AWPROD.LOG):

+ /app/sct/banprod/general/exe/srtload -f -o /appworx/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.lis

+ cat /appworx/run/temppar.4708342

+ 1>> /appworx/out/ADMSPROS.SRTLOAD_01.3370022.3370029.00.1681460.log 2>& 1

/appworx/exec/AW_BANNER[414]: 770394 Memory fault(coredump)

+ err=139

+ rm -f /appworx/run/temppar.4708342

+ 1> /dev/null 2>& 1

+ [ 139 != 0 ]

+ echo Non-zero error generated from running job. The program srtload failed to run successfully.

Non-zero error generated from running job. The program srtload failed to run successfully.
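
The err=139 above ties in with the "Memory fault(coredump)" message: shells conventionally report a process killed by signal N as exit status 128 + N, and 139 - 128 = 11, the segmentation fault signal.  The small sketch below decodes a status that way.  describe_status is a hypothetical helper, not part of AW_BANNER.

```shell
#!/bin/sh
# Sketch only: describe_status is a hypothetical helper name.
# Turns a shell exit status into a short description, using the common
# convention that 128 + N means "terminated by signal N".
describe_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with code $status"
    fi
}
```

So srtload wrote its "completed successfully" message and its .lis file, and then crashed on the way out, which is why the log and the return code disagree.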

Janice.

 

 

 

 

Aborted Module Name:   ADMSAPPL.ADMSS484_01

 

  Date:        Day:      Time:          Resolution:

12/13/10     Mon       22:12         Restarted by ITS.

 

Error log and follow up comments:

 

 

+ print *** \n*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212_sql_followup

+ egrep -v -f /ais01/dat/misc/prod/errstrg_sql_ORA_ok

+ egrep -f /ais01/dat/misc/prod/errstrg_sql /appworx/out/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212.AWPROD.LOG

+ 1>> /ais01/dat/work/prod/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212_sql_followup

+ print *** \n*** END SEARCH OF LOG FOR SQL ERROR STRINGS \n***

+ 1>> /ais01/dat/work/prod/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212_sql_followup

+ cat /ais01/dat/work/prod/ADMSAPPL.ADMSS484_01.5502469.5502474.00.2010_12_13_2212_sql_followup

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 660

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

It is helpful to have some of the information before the actual error.  In this case, I went back to the file in /ais01/joblog to get the output before the error.  This tells me the person the sql was processing when the error occurred.  So, this portion of the file:

Aidm 476631

sarrqst_cnt: 14

Before residency

Not all 3 Residency questions answered 'Y'

res_code: 0 res_claim: Y res_nonzip:  res_zip: 80919 Application level is UG Major to be inserted ELEG-BMEE-BS declare

ERROR at line 1:

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 660

Is more useful to us than just the error and the line number.
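
One way to capture that kind of context automatically is to print the lines just before the first ORA- message in the joblog.  The sketch below assumes nothing about the existing error-search scripts: context_before_error is a hypothetical helper, and the default of 5 lines is arbitrary.

```shell
#!/bin/sh
# Sketch only: context_before_error is a hypothetical helper name.
# context_before_error LOGFILE [LINES]
# Prints up to LINES lines that precede the first line containing an
# ORA-nnnnn message, followed by that line.
context_before_error() {
    log=$1
    n=${2:-5}
    awk -v n="$n" '
        /ORA-[0-9]+/ {
            start = NR - n
            if (start < 1) start = 1
            if (n > 0) for (i = start; i < NR; i++) print buf[i % n]
            print
            exit
        }
        { if (n > 0) buf[NR % n] = $0 }   # ring buffer of the last n lines
    ' "$log"
}
```

Appending that output to the abort email would give the functional analyst the record being processed as well as the error and line number.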

Bev.

 

I believe that the issue related to the data is solved. Please run the chain again.

Rami.

 

 

 

Aborted Module Name:   KFSXPDSA.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

09/30/09     Wed       08:36         See notes from Janice below.

 

Error log and follow up comments:

Kevin is already working on fixing the email address problem and will let us know when the failed component can be restarted.

Pertinent error messages from the joblog:

2009-09-30 08:10:03,509 [main] INFO  org.kuali.kfs.sys.batch.Job :: Executing step: pdpSendAchAdviceNotificationsStep=class org.kuali.kfs.pdp.batch.SendAchAdviceNotificationsStep

2009-09-30 08:10:53,927 [main] ERROR org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl :: sendAchAdviceEmail() Invalid email address. Sending message to BFS_AcctPay@mail.colostate.edu

org.kuali.rice.kns.mail.InvalidAddressException: org.springframework.mail.MailSendException; nested exceptions (1) are:

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

                com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <anne.hanika@colostate.edu>... User unknown

                ... 6 more

2009-09-30 08:10:53,966 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXPDSA.pdpSendAchAdviceNotificationsStep.3378892.3378921.00 steps: [pdpSendAchAdviceNotificationsStep]

2009-09-30 08:10:53,966 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

                at org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl.sendAchAdviceEmail(PdpEmailServiceImpl.java:560)

                at org.kuali.kfs.pdp.batch.service.impl.AchAdviceNotificationServiceImpl.sendAdviceNotifications(AchAdviceNotificationServiceImpl.java:56)

                at org.kuali.kfs.pdp.batch.SendAchAdviceNotificationsStep.execute(SendAchAdviceNotificationsStep.java:38)

                at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

                at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

                at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)   Janice.

This module aborted again. Dawn.

Another bad email address so it failed again:

2009-09-30 09:01:38,384 [main] ERROR org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl :: sendAchAdviceEmail() Invalid email address. Sending message to BFS_AcctPay@mail.colostate.edu

org.kuali.rice.kns.mail.InvalidAddressException: org.springframework.mail.MailSendException; nested exceptions (1) are:

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

                com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <nicole.brennan@colostate.edu>... User unknown

2009-09-30 09:01:38,429 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXPDSA.pdpSendAchAdviceNotificationsStep.3378892.3378921.01 steps: [pdpSendAchAdviceNotificationsStep]

2009-09-30 09:01:38,429 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Could not send email to advice return email address on customer profile: BFS_AcctPay@mail.colostate.edu

                at org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl.sendAchAdviceEmail(PdpEmailServiceImpl.java:560)

                at org.kuali.kfs.pdp.batch.service.impl.AchAdviceNotificationServiceImpl.sendAdviceNotifications(AchAdviceNotificationServiceImpl.java:56)

                at org.kuali.kfs.pdp.batch.SendAchAdviceNotificationsStep.execute(SendAchAdviceNotificationsStep.java:38)

                at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

                at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

                at edu.csu.batch.service.RunBatch.main(RunBatch.java:67) Janice.

Another email showing the actual error message(s) for this failed component – IT Scheduling should include this type of error message in their APPLICATIONS MANAGER-ABORT email(s), if possible.    Janice.

Just an F.Y.I. –- I went into the output file and was doing a “Find” looking for the errors Janice is showing in her email below, and Applications Manager froze up on me before I got to the errors she mentions.  This happened twice, and each time I had to close Applications Manager through Task Manager and re-login.  I finally found the errors she mentions by looking at the output file in /ais01/joblog in kebler using spf.  Steve.

That’s interesting – I was able to view those errors (and copy them) via the Applications Manager Output Files Viewer – ugh, just what we need – another unexplainable Applications Manager problem!  Instead of doing a “Find”, you might try just scrolling down through the output file – that’s how I got to the error messages, rather than “Find”ing them.   Janice.

 

 

Aborted Module Name:   ADMSGREL.SRTLOAD_01

 

  Date:        Day:      Time:          Resolution:

10/01/09     Thu        09:35          See follow up from Bev below.

 

Error log and follow up comments:

 

Parameter 01 /ais01/dat/work/prod/d3385536 Read from Job Submission

Parameter 02 GRE Read from Job Submission

Parameter 05 G Read from Job Submission

Parameter 15 GRE Read from Job Submission

Parameter 22 U Read from Job Submission

PREL Code=GRE and TAPE=GRE Interface Code GRE Contact Code XTS Source A00005

MAJOR2INTEREST Function N

Valid Tape Code=GRE GRE Test Score Tape

Parameter 07 GR Read from Job Submission

Parameter 99 55 Read from Job Submission

Parameter 03 was not found in Job Submission

Parameter 04 was not found in Job Submission

Parameter 06 was not found in Job Submission

Parameter 08 M Read from Job Submission

Parameter 09 was not found in Job Submission

Parameter 10 was not found in Job Submission

Parameter 11 was not found in Job Submission

Parameter 12 XTS Read from Job Submission

Parameter 13 was not found in Job Submission

Parameter 14 MA Read from Job Submission

Parameter 16 A Read from Job Submission

Parameter 17 EADM Read from Job Submission

Parameter 18 N Read from Job Submission

Parameter 19 was not found in Job Submission

Parameter 20 N Read from Job Submission

Parameter 21 Y Read from Job Submission

Parameter 23 was not found in Job Submission

Parameter 24 was not found in Job Submission

file name /ais01/dat/work/prod/d3385536

*ERROR* INTERNAL FIELD TABLE SRTTPFD_ROW SIZED FOR 1000 FIELDS.

RESIZE STRUCT SRTTPFD_ROW TO ALLOW MORE ENTRIES.

srtload terminated with error

121 lines written to /appworx/out/ADMSGREL.SRTLOAD_01.3385536.3385539.00.1683644.lis

 

I found the exact error on UDC as a Defect.  I logged a critical Service Request with Sungard, because the solution is part of 8.3. 

I received a call from Sungard.  It has been escalated.  We are the third school to encounter this problem.

For now, we have no resolution.  ………Bev

 

This problem is unique to GRE loads.  The other SRTLOADs appear to be fixed with the patch Mark applied.

Not running ADMSGREL until there’s a solution would be good……………Bev

 

Marcella and I tested this (SRTLOAD.pc) with two different loads and it now appears to be working correctly.

Bev.

 

Hi Robin, this issue was resolved on Monday.  It is ok to run GRE jobs in Applications Manager now.

Marcella.

 

 

 

Aborted Module Name:   HRMSFLX_SAL.HRMSS138_01

 

  Date:        Day:      Time:          Resolution:

10/23/09     Fri         11:45            See note from Janice below.

         

Error log and follow up comments:

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-01821: date format not recognized

ORA-06512: at line 287

 

11:40:54 287       Select substr(address_line1, 1, 30)

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

Another email showing the actual error message(s) for this failed component – IT Scheduling should include these types of error messages in their APPLICATIONS MANAGER-ABORT email(s), if possible …………………………Janice.

 

HRMSFLX_SAL.HRMSS138_01 is complete.

David.

 

 

Aborted Module Name: KFSXFPPC.KFSX_JAVA_03

 

  Date:        Day:      Time:          Resolution:

10/27/09     Tue        19:26            Restarted by Janice.

Error log and follow up comments:

 

Janice,

Debbie asked me to talk to you about this error.  Three of us looked for the error and we did not find what you found.  Do you have time to show me how you found this error?

 

I usually open the output listing and go to the bottom, then just start scrolling up until you see something like this:

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

 

Then keep scrolling up a bit and you’ll see the error and section of the output log which I emailed.  I usually just do the scrolling, rather than searching for a word like “error”.    In the past, it seems that Applications Manager hanging might be related to searching output files – maybe it’s just coincidence, but I can recall a few times when IT Scheduling was doing the find command in an output file and Applications Manager became non-responsive – so I usually just scroll through the listing instead.
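The bottom-up review described above can also be done outside Applications Manager. What follows is only a minimal sketch, assuming the output listing is available as a plain text file. The function names and the 40-line window are illustrative; only the "SCRIPT ABORTED" marker text comes from the listing above.

```python
from typing import List, Optional

# Marker text taken from the errtrap_ssh output shown above.
ABORT_MARKER = "SCRIPT ABORTED"


def context_before_abort(lines: List[str], before: int = 40) -> Optional[List[str]]:
    """Return up to `before` lines that precede the last abort marker, or None."""
    for idx in range(len(lines) - 1, -1, -1):  # walk upward from the bottom
        if ABORT_MARKER in lines[idx]:
            start = max(0, idx - before)
            return lines[start:idx]
    return None


def error_lines(lines: List[str]) -> List[str]:
    """Keep only lines that look like logged errors, exceptions, or Oracle errors."""
    keys = (" ERROR ", "Exception", "ORA-")
    return [ln for ln in lines if any(k in ln for k in keys)]
```

Reading the listing into a list with `open(path).read().splitlines()` and passing it to `context_before_abort` gives the section that would otherwise be found by scrolling; `error_lines` narrows it further.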

 

2009-11-03 19:27:10,511 [main] ERROR org.kuali.rice.kew.util.XmlHelper :: Error accessing method 'getGroup' of instance of class org.kuali.rice.kew.actionitem.ActionItem

2009-11-03 19:27:10,624 [main] INFO  org.kuali.rice.kew.engine.StandardWorkflowEngine :: Successfully processed document: 482781 : null

2009-11-03 19:27:17,732 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPC.procurementCardRouteDocumentsStep.3527835.3527843.00 steps: [procurementCardRouteDocumentsStep]

2009-11-03 19:27:17,732 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

RunBatch ERROR: Exception found:

org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

Caused by: javax.transaction.RollbackException

                at org.objectweb.jotm.TransactionImpl.commit(TransactionImpl.java:245)

                at org.objectweb.jotm.Current.commit(Current.java:488)

                at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:842)

                at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:651)

                at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:621)

                at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:311)

                at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:117)

                at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)

                at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:203)

                at $Proxy172.routeProcurementCardDocuments(Unknown Source)

                at org.kuali.kfs.fp.batch.ProcurementCardRouteDocumentsStep.execute(ProcurementCardRouteDocumentsStep.java:42)

                at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

 

Please run KFSXFPPC…………..Kevin.

 

I restarted the failed KFSXFPPC.KFSX_JAVA_03 chain component……….Janice.

 

 

 

 

Aborted Module Name:   LAUNCH/DB ERRORS

 

  Date:        Day:      Time:          Resolution:

11/03/09     Tue       10:00-12:30  See note from Janice below.

 

 

Error log and follow up comments:

 

In the past, the sole purpose of the APWXCHCK_HOURLY_SYSTEM_CHECK chain was to  "touch" /ais01/dat/apwx/prod/APWXCHCK_HOURLY.DAT, the Kebler file which the Enterprise Manager process monitors to determine if Applications Manager may have stalled and needs to be recycled.  Although the majority of LAUNCH and DB ERRORS are connected to java hanging/Applications Manager problems, we occasionally encounter these errors for other reasons. 

The /ais01/dat/apwx/prod/APWXCHCK_HOURLY.DAT file, coupled with the Enterprise Manager process, will not detect such errors and therefore a separate process was developed.  While there may be some duplication of reporting when LAUNCH/DB ERRORS are related to java hanging/Applications Manager problems, the separate process to check backlog for such errors is the only way to detect and report on *ALL* LAUNCH/DB ERRORS.
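The heartbeat idea behind the touched file can be illustrated as follows. This is a sketch under assumptions, not the Enterprise Manager logic: the two-hour threshold and the function names are hypothetical, and the only thing taken from the note above is that a file is touched hourly and a monitor looks at how fresh it is.

```python
import os
import time
from typing import Optional


def heartbeat_age_seconds(path: str, now: Optional[float] = None) -> float:
    """Seconds elapsed since the heartbeat file was last touched."""
    current = time.time() if now is None else now
    return current - os.path.getmtime(path)


def is_stalled(path: str, max_age_seconds: float = 2 * 60 * 60,
               now: Optional[float] = None) -> bool:
    """Treat a missing or stale heartbeat file as a sign the scheduler stalled.

    The default threshold (two missed hourly touches) is an assumption.
    """
    try:
        return heartbeat_age_seconds(path, now) > max_age_seconds
    except FileNotFoundError:
        return True
```

As the note says, a check of this kind only detects a stalled scheduler; it cannot see individual components sitting in LAUNCH ERROR or DB ERROR status, which is why the separate backlog check exists.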

 

Below please find a sample of the email which will be sent from the APWXCHCK_HOURLY_SYSTEM_CHECK chain if any chain components are found in backlog in either LAUNCH ERROR or DB ERROR status.   The backlog check is performed by the APWXCHK_BACKLOG module, which runs the underlying APWXCHK_BACKLOG report that I created for this purpose via the Applications Manager Reports feature.  Via an AFTER condition of the APWXCHK_BACKLOG module, the SEND_MAIL module will be requested to email the APWXCHK_BACKLOG report **if** the size of the output report file indicates that detail records are present on the report.  The recipients for this email are contained within the mailing list file,  /ais01/dat/misc/mailst/SEND_MAIL.CRITFAIL.LST, which currently contains:

 

970-226-7550@PAGE.METROCALL.COM                                                

Jan.Mueller@ColoState.EDU                                                      

Janice.Wilkinson@ColoState.EDU                                                 

David.Peterson@ColoState.EDU                                                   

IT_scheduling@mailer.is.colostate.edu 

 

The same mailing list file, /ais01/dat/misc/mailst/SEND_MAIL.CRITFAIL.LST, is also utilized for the SEND_MAIL component which is spawned via the COMPLETION script for Critical Chain Component Aborts.

 

-----Original Message-----

From: Applications Manager@Kebler.is.colostate.edu [mailto:Applications Manager@Kebler.is.colostate.edu]

Sent: Tuesday, November 03, 2009 11:16 AM

Cc: Wilkinson,Janice

Subject: LAUNCH/DB ERRORS

 

Tue Nov 03 11:15:35 MST 2009                  Page 1

        Check Backlog for Launch or DB Errors      

Status Name  Module                            Jobid

------------ ---------------------------- ----------

DB ERROR     APWXTST0_EX.WAIT_FOR_COND_01    3526200

LAUNCH ERROR APWXTST0_EX.PAYRPNAC_01      3526201.01

 

P.S.  Just a reminder that the Enterprise Manager process will continue to send the page to Greg's pager number when Applications Manager has been recycled.

Janice.
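The decision of whether the backlog report contains detail records can be sketched in code. Note that the process described above checks the size of the report file; the sketch below swaps in a content-based check instead, parsing rows in the layout of the sample report above. The function names are illustrative and the column layout is assumed from that one sample.

```python
from typing import List, Tuple

# The two statuses the backlog report is meant to catch.
ERROR_STATUSES = ("LAUNCH ERROR", "DB ERROR")


def backlog_errors(report_text: str) -> List[Tuple[str, str, str]]:
    """Return (status, module, jobid) for each detail row in the report text."""
    rows = []
    for line in report_text.splitlines():
        stripped = line.strip()
        for status in ERROR_STATUSES:
            if stripped.startswith(status):
                rest = stripped[len(status):].split()
                if len(rest) >= 2:  # expect at least module and jobid
                    rows.append((status, rest[0], rest[1]))
                break
    return rows


def should_send_mail(report_text: str) -> bool:
    """Mail the report only when at least one detail row is present."""
    return bool(backlog_errors(report_text))
```

A report that holds only the title, column headings, and separator line yields no rows, so no mail would be requested.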

 

 

 

 

Aborted Module Name:   KFSXFPPC.KFSX_JAVA_03

 

  Date:        Day:      Time:          Resolution:

11/04/09     Wed       19:27           See note from Janice below.

 

Error log and follow up comments:

Janice,

 

Debbie asked me to talk to you about this error.  Three of us looked for the error and we did not find what you found.  Do you have time to show me how you found this error?

 

I usually open the output listing and go to the bottom, then just start scrolling up until you see something like this:

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

 

Then keep scrolling up a bit and you’ll see the error and section of the output log which I emailed.  I usually just do the scrolling, rather than searching for a word like “error”.    In the past, it seems that Applications Manager hanging might be related to searching output files – maybe it’s just coincidence, but I can recall a few times when IT Scheduling was doing the find command in an output file and Applications Manager became non-responsive – so I usually just scroll through the listing instead.

 

2009-11-03 19:27:10,511 [main] ERROR org.kuali.rice.kew.util.XmlHelper :: Error accessing method 'getGroup' of instance of class org.kuali.rice.kew.actionitem.ActionItem

2009-11-03 19:27:10,624 [main] INFO  org.kuali.rice.kew.engine.StandardWorkflowEngine :: Successfully processed document: 482781 : null

2009-11-03 19:27:17,732 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPC.procurementCardRouteDocumentsStep.3527835.3527843.00 steps: [procurementCardRouteDocumentsStep]

2009-11-03 19:27:17,732 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

RunBatch ERROR: Exception found:

org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

Caused by: javax.transaction.RollbackException

                at org.objectweb.jotm.TransactionImpl.commit(TransactionImpl.java:245)

                at org.objectweb.jotm.Current.commit(Current.java:488)

                at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:842)

                at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:651)

                at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:621)

                at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:311)

                at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:117)

                at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)

                at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:203)

                at $Proxy172.routeProcurementCardDocuments(Unknown Source)

                at org.kuali.kfs.fp.batch.ProcurementCardRouteDocumentsStep.execute(ProcurementCardRouteDocumentsStep.java:42)

                at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

 

Please run KFSXFPPC…………..Kevin.

 

I restarted the failed KFSXFPPC.KFSX_JAVA_03 chain component……….Janice.

 

 

 

 

Aborted Module Name:  HRMSDED_SAL.HRMSRPTS-LOOP_01

 

  Date:        Day:      Time:          Resolution:

11/19/09     Thu       13:12           See follow up notes below.

 

Error log and follow up comments:

 

+ . /Applications Manager/exec/COMPLETION

+ egrep ABORTED|CRITFAIL|C-Error

+ grep 3597011

      3597011.00 BATCH     HRMSDED_SAL.HRMSR00211/19 13:24 00:20:35 C-Error     APPLICATIONS MANAGER    HRMSDED_DEDUCTION_REPORTS

+ print Failure in spawned HRMSR002 - abort HRMSRPTS-LOOP

Failure in spawned HRMSR002 - abort HRMSRPTS-LOOP

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

Debra called me on this failure, so I talked with Steve Hill.  Steve was unaware of this failure until he received a Clarity task yesterday PM to fix a bug in the report.  The report was modified and is waiting on Chris Domanik in HR to validate the report has been corrected.  Once corrected, he will generate a turnover and let ITS know that the job can be restarted.  Janice or Jan may be able to weigh in on the restart of this process..……….Ken

 

 

As Ken indicated in his email, the proposed solution for the failed HRMSR002 is waiting for approval from Chris.  There are a total of 20 entries in the salary report driver file, /ais01/dat/hrms/prod/HRMSRPTS_SALARY_DRIVER.  Two of those entries are for HRMSR002 – one to produce a download file and another to produce a hard-copy report.   While we wait for approval regarding the HRMSR002 solution, I’ve moved the two HRMSR002 entries to the end of the driver file so we can proceed with the other reports that are to run, and I’ve restarted the failed HRMSDED_SAL.HRMSRPTS-LOOP_01.  When the HRMSRPTS-LOOP reaches the HRMSR002 entries in the driver file, the spawned HRMSR002 will fail again unless the solution has already been implemented into production. If HRMSR002 and/or any other of the spawned reports fail, please be sure that notification is sent to Steve Hill in addition to the apwx_maint mailing list.

Janice.
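Moving the failing report's entries to the end of the driver file, as described above, amounts to a stable reorder. The following is a rough sketch only: the actual driver file layout is not shown in this log, so it assumes one entry per line with the report name appearing somewhere in the line, and the function name is hypothetical.

```python
from typing import List


def defer_entries(entries: List[str], report_name: str) -> List[str]:
    """Move every entry that mentions report_name to the end, keeping relative order."""
    keep = [entry for entry in entries if report_name not in entry]
    deferred = [entry for entry in entries if report_name in entry]
    return keep + deferred
```

The other reports keep their original sequence, and the deferred entries still run last, so they fail again unless the fix is in production by then.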

 

 

 

 

 

 

Aborted Module Name:   KFSXOFF_D1.KFSX_OFFLINE_01

 

  Date:        Day:      Time:          Resolution:

11/19/09     Thu         19:10           Restarted by Janice.

 

Error log and follow up comments:

 

 

19 19:10:51-Child:Done

19 19:10:55-Parent: (4)Checking child process(413908)

19 19:10:55-Parent: Child process[413908] done.

19 19:10:55-Parent: Checking child mem

19 19:10:55-Parent: Value in mem [1]

19 19:10:55-Parent: Child process returned a value.

19 19:10:55-Parent: child process done.

19 19:10:55-Parent:Value in mem [1]

19 19:10:55-Deleting kill file if exists [/Applications Manager/run/jobpid.3599633.00]

19 19:10:55-Deleting flag file if exists [/Applications Manager/run/jobpid.3599633.00]

19 19:10:55-Getting env 'SURUNEXIT'

19 19:10:55-Null

Exiting with su job error code[1].

error is 1

 

Last night’s KFSX schedule stalled due to an abort in the KFSXOFF.KFSX_OFFLINE_01 component of the KFSXOFF_D1_ONLINE_OFF_JOB1 chain.   The KFSX offline/online processes perform remote shell execution (on Malta) of the scripts to shut down Tomcat and start up Tomcat, respectively.  The logs (feedback) from the remote shell executions are written to the /ais02/log directory on an NFS mounted volume to allow the controlling KFSX_OFFLINE and KFSX_ONLINE components, which run on Kebler, to examine the logs and determine if the remote shell processes were successful.  Occasionally, with NFS mounted volumes, we have experienced a delay in availability of a file back on Kebler, resulting in either an empty file or a partially complete file being examined rather than the complete file.  Primarily, we’ve seen this type of delay with utl files from sql executions, and until last night we had not experienced it for remote shell execution logs.   While the remote shell execution to shut down Tomcat actually was successful last night, an incomplete version of the associated log file was examined by the controlling KFSX_OFFLINE script on Kebler.  Consequently, it appeared to the KFSX_OFFLINE script that the Tomcat shutdown was NOT successful, which caused the KFSXOFF.KFSX_OFFLINE_01 component to abort.

 

To prevent a recurrence of this situation, I have made the following changes:

·       Added a sleep command in the KFSX_OFFLINE.KSH  and KFSX_ONLINE.KSH scripts to introduce a delay between the rsh (remote shell execution) command and the subsequent process which attempts to examine the remote shell log file in the /ais02/log directory on the NFS mounted volume.

·       The following KFSX offline/online chains are now critical chains so any failures within these chains will result in a page to IT Scheduling Oncall Staff, who will then contact the appropriate IS staff to follow-up:

KFSXOFF_D1_ONLINE_OFF_JOB1            KFSX Daily Online Off Job #1 (Shutdown Tomcat)

KFSXOFF_D2_ONLINE_OFF_JOB2            KFSX Daily Online Off Job #2 (Shutdown Tomcat)

KFSXON_D1_ONLINE_ON_JOB1               KFSX Daily Online On Job #1(Startup Tomcat)

KFSXON_D2_ONLINE_ON_JOB2               KFSX Daily Online On Job #2(Startup Tomcat)

 

Janice.
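The change described above is a fixed sleep before the log is examined. A different approach, shown here only as a sketch and not as what the KSH scripts do, is to poll until the file size stops changing. The function name, number of checks, interval, and timeout are all hypothetical, and a stable size is only a heuristic: it does not prove the file is complete.

```python
import os
import time


def wait_until_stable(path: str, checks: int = 3, interval: float = 1.0,
                      timeout: float = 60.0) -> bool:
    """Return True once the file is non-empty and its size is unchanged
    for `checks` consecutive polls; return False if `timeout` expires first."""
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:  # file not visible yet on this host
            size = -1
        if size > 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            stable = 0
        last_size = size
        time.sleep(interval)
    return False
```

If the remote script wrote a known final line to its log, checking for that line would be a stronger completeness test than size alone.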

 

 

 

Aborted Module Name:   KFSXFPPC_FP_PCARD_DOCUMENTS

 

  Date:        Day:      Time:          Resolution:

11/30/09     Mon       09:00           Restarted by Jan.

 

Error log and follow up comments:

 

org.kuali.rice.kew.actionitem.ActionItem

2009-11-24 19:18:17,384 [main] ERROR org.kuali.rice.kew.mail.service.impl.ActionListEmailServiceImpl :: Error sending Action List email.

org.kuali.rice.kew.exception.WorkflowRuntimeException: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Alissa.Gigliotti@colostate.edu>... User unknown

        at org.kuali.rice.kew.mail.service.impl.DefaultEmailService.sendEmail(DefaultEmailService.java:73)

        at sun.reflect.GeneratedMethodAccessor536.invoke(Unknown Source)

 

I updated the email to akgigliotti@hotmail.com

Could you rerun the job?

John Hunter.

 

 

 

 

 

 

Aborted Module Name:   HRMSS041.HRMSS041_01

 

  Date:        Day:      Time:          Resolution:

11/30/09     Mon       20:16           Restarted by David.

 

Error log and follow up comments:

 

 

20:15:37 534 

20:15:37 535                         if l_run_date between c2.cvg_strt_dt and c2.cvg_thru_dt then

20:15:37 536  --  dbms_output.put_line(c1.person_id||','|| c1.prtt_enrt_rslt_id); --used for error testing

20:15:37 537                            -- Member Level Detail segment

20:15:37 538                            l_segment := csuh_edi_834_pkg.edi_ins(

20:15:37 539                                                            p_ins01 => 'N'

20:15:37 540                                                           ,p_ins02 => c2.contact_type_code

20:15:37 541                                                           ,p_ins03 => '030'

20:15:37 542                                                           ,p_ins04 => '20'

20:15:37 543                                                           ,p_ins05 => 'A');

20:15:37 544                            write_segment(ws_file_handle, l_segment,l_segment_count);

20:15:37 545 

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 538

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

This failed because there are several records out there again that have “BEN” as contact type that shouldn’t.

Jan.

 

Chris,

Aren’t your people running that script daily and fixing up the records?

-Bob-

 

I’ve included HRMSS041 as an exception so the HRMSAW99 should finish in about 5 minutes when it wakes up again to check backlog.  Once that has completed, IT Scheduling may release staging which will allow HRMSAW15 to be staged in.  HRMSS041 has no successors, so if we need to let it hang until tomorrow, that shouldn’t cause any problems with tonight’s HRMS schedule.

Janice.

 

HRMSS041 is complete.

David.

 

 

 

Aborted Module Name:   KFSXFPPC.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

12/03/09     Thu       15:03           Restarted by ITS.

 

Error log and follow up comments:

 

2009-12-03 15:03:32,138 [main] INFO  edu.csu.batch.service.RunBatch :: Executing job: KFSXFPPC.procurementCardLoadStep.3650673.3650674.00 steps: [procurementCardLoadStep]

2009-12-03 15:03:32,201 [main] INFO  org.kuali.kfs.sys.batch.Job :: Started processing step: 0=procurementCardLoadStep

2009-12-03 15:03:32,209 [main] INFO  org.kuali.kfs.sys.batch.Job :: Creating user session for step: procurementCardLoadStep=kfs

2009-12-03 15:03:32,395 [main] INFO  org.kuali.kfs.sys.batch.Job :: Executing step: procurementCardLoadStep=class org.kuali.kfs.fp.batch.ProcurementCardLoadStep

2009-12-03 15:03:33,073 [main] ERROR org.kuali.kfs.sys.exception.XmlErrorHandler :: error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

2009-12-03 15:03:33,157 [main] ERROR org.kuali.kfs.fp.batch.service.impl.ProcurementCardLoadTransactionsServiceImpl :: Error parsing xml error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

2009-12-03 15:03:33,160 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPC.procurementCardLoadStep.3650673.3650674.00 steps: [procurementCardLoadStep]

2009-12-03 15:03:33,160 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.RuntimeException: Error parsing xml error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Error parsing xml error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

                at org.kuali.kfs.fp.batch.service.impl.ProcurementCardLoadTransactionsServiceImpl.loadProcurementCardFile(ProcurementCardLoadTransactionsServiceImpl.java:75)

                at org.kuali.kfs.fp.batch.ProcurementCardLoadStep.execute(ProcurementCardLoadStep.java:54)

                at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

                at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

                at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

Caused by: org.kuali.kfs.sys.exception.XMLParseException: error Parsing error was encountered on line 5928, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

                at org.kuali.kfs.sys.exception.XmlErrorHandler.error(XmlErrorHandler.java:42)

                at org.apache.xerces.util.ErrorHandlerWrapper.error(Unknown Source)

David.

 

The last pcard transaction in the input file was missing the Chart Of Accounts code of CO for account 1306230. I updated this, so can someone please re-run. I’ll contact John Swaro to update the default COA.

John Walker.
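A pre-check of the input file could catch an empty two-character field before the load step runs. The following is a hedged sketch: the element name used in the usage note is hypothetical because the actual pcard XML layout is not shown in this log, and the sketch assumes the file uses no XML namespaces.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def short_values(xml_text: str, tag: str, min_length: int = 2) -> List[Tuple[int, str]]:
    """Return (occurrence number, value) for each `tag` element whose text
    is shorter than min_length, counting occurrences in document order."""
    root = ET.fromstring(xml_text)
    problems = []
    for position, elem in enumerate(root.iter(tag), start=1):
        value = (elem.text or "").strip()
        if len(value) < min_length:
            problems.append((position, value))
    return problems
```

Called as `short_values(text, "chartOfAccountsCode")`, with that tag name being a placeholder, an empty result would mean every occurrence meets the minimum length of 2 that the schema error refers to.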

 

 

 

Aborted Module Name:  EIDSUPDT.EIDSS002_01

 

  Date:        Day:      Time:          Resolution:

06/14/11     Tue         22:38          Restarted by Joleen.

 

Error log and follow up comments:

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error

ORA-06512: at line 23

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

I'm forwarding this to Phil since neither of the individuals on the EIDS Alert list are here today!!

Janice.

 

I couldn't figure out any obvious cause for the numeric value error. Please try to run this again, and if we still have the error, I will update the program to output some data.

Rami.

 

 

 

 

Aborted Module Name:   KFSXFPPD.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

12/10/09     Thu       14:06            Restarted.

Error log and follow up comments:

 

 

 

2009-12-10 14:06:00,421 [main] INFO  org.kuali.rice.kew.docsearch.SearchableAttribute :: ...finished indexing document 523858 for document search.

2009-12-10 14:06:00,616 [main] ERROR org.apache.ojb.broker.accesslayer.JdbcAccessImpl ::

* SQLException during execution of sql-statement:

* sql statement was 'INSERT INTO PDP_PMT_NTE_TXT_T (PMT_NTE_ID,CUST_NTE_LN_NBR,CUST_NTE_TXT,LST_UPDT_TS,VER_NBR,PMT_DTL_ID,OBJ_ID) VALUES (?,?,?,?,?,?,?) '

* Exception message is [ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

* Vendor error code [12899]

* SQL state code [72000]

* Target class is 'org.kuali.kfs.pdp.businessobject.PaymentNoteText'

* PK of the target object is [id=10071760]

* Source object: paymentNoteText(id)=(10071760)

* The root stack trace is -->

* java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

                at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

                at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:331)

                at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:288)

                at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:745)

                at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:216)

                at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:966)

 

 

Shawn,

A job failed due to a crazy character in the DV check stub field for 2 DV docs. Can you run the attached script in kfsprd to correct (remove) the crazy characters?

Please let David/Apwx_maint know when you are done, so that they can re-run the job.

John Walker.
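One plausible reading of "actual: 91, maximum: 90" is that the column limit is measured in bytes and a single non-ASCII character took more than one byte. The log does not confirm this; the sketch below assumes byte-length semantics and a UTF-8 style encoding, and the function names are illustrative.

```python
def utf8_length(text: str) -> int:
    """Length of the text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def oversize(text: str, max_bytes: int = 90) -> bool:
    """True when the encoded text would not fit a max_bytes-wide column."""
    return utf8_length(text) > max_bytes


# 90 characters on screen, but the accented letter needs two bytes in UTF-8.
note = "A" * 89 + "\u00e9"
```

Here `len(note)` is 90 while `utf8_length(note)` is 91, which matches the shape of the ORA-12899 message above.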

 

 

Aborted Module Name:   HRMSDAY1.HRMSS007_01

 

  Date:        Day:      Time:          Resolution:

12/16/09     Wed      07:15            See note from Janice below.

09/15/10     Wed      20:20            Restarted by ITS.

 

Error log and follow up comments:

 

12/16/09.

Both HRMSS007 and HRMSS009 failed with:

ERROR at line 1:

ORA-12541: TNS:no listener

which I believe was caused by problems with ODS or this link to ODS.   Both of these sqls use the csug_gp_demo_v view, which selects data from csuban.csug_gp_demo@odsprod.world

When I attempt via an sql (on hrprod) to just select count using this link to odsprod, the same TNS no listener error is produced:

07:20:07 SQL> select count(*) from csuban.csug_gp_demo@odsprod.world

07:22:49   2  /

select count(*) from csuban.csug_gp_demo@odsprod.world

ERROR at line 1:

ORA-12541: TNS:no listener

 

09/15/10.

ERROR at line 23:

ORA-06550: line 23, column 17:

PL/SQL: ORA-00904: "CSUG_GP"."MULTI_RACE_IND": invalid identifier

ORA-06550: line 8, column 1:

PL/SQL: SQL Statement ignored

ORA-06550: line 184, column 37:

PLS-00364: loop index variable 'STUDENT_ETHNIC_REC' use is invalid

ORA-06550: line 184, column 4:

PL/SQL: Statement ignored

ORA-06550: line 185, column 37:

PLS-00364: loop index variable 'STUDENT_ETHNIC_REC' use is invalid

ORA-06550: line 185, column 4:

PL/SQL: Statement ignored

ORA-06550: line 186, column 37:

PLS-00364: loop index variable 'STUDENT_ETHNIC_REC' use is invalid

ORA-06550: line 186, column 4:

PL/SQL: Statement ignored

ORA-06550: line 192, column 37:

PLS-00364: loop index variable 'STUDENT_ETHNIC_REC' use is invalid

 

This process can be restarted.  I restored the old script and put it in the /ais01/src/sql/temp directory.  This file can be flagged to stay in this directory for 1 week.  If I need to contact someone else concerning the file in the temp directory please let me know.

Steve H.

 

 

 

Aborted Module Name:   KFSXGLPO_D1.KFSX_JAVA_02

 

  Date:        Day:      Time:          Resolution:

12/15/09     Tue        19:50            Restarted by Jan.

 

Error log and follow up comments:

 

 

java.lang.RuntimeException: PosterServiceImpl Stopped: AbstractUpdatingPreparedStatementCachingDaoJdbc.UpdatingJdbcWrapper encountered exception during getObject method for type: class org.kuali.kfs.gl.businessobject.Entry

              at org.kuali.kfs.gl.batch.service.impl.PosterServiceImpl.postTransaction(PosterServiceImpl.java:460)

at org.enhydra.jdbc.core.CorePreparedStatement.executeUpdate(CorePreparedStatement.java:102)

              at org.kuali.kfs.sys.batch.dataaccess.impl.AbstractPreparedStatementCachingDaoJdbc$JdbcWrapper.update(AbstractPreparedStatementCachingDaoJdbc.java:39)

              ... 87 more

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1

 

KFSX Follow-up Summary:

Since the KFSXGLPO_D1.KFSX_JAVA_02 failure was within the KFSX Online off (Tomcat down) window, Tomcat should have remained down until all processing scheduled to occur within the KFSX Online off window had completed this morning.  Although I was not directly informed of any decision to bring Tomcat up (KFSX Online access back on), I discovered that it had indeed been manually started this morning, thereby allowing users online access to KFS.  This posed a dilemma: not only did the KFSXGL_D1_DAILY_GL_UPDT_JOB1 update chain (which had the KFSXGLPO_D1.KFSX_JAVA_02 failure) still need to complete, but so did several other chains, all of which SHOULD run with Tomcat down (i.e. no KFSX users on the system).  Additionally, the 2nd GL update chain, KFSXGL_D2_DAILY_GL_UPDT_JOB2, which posts encumbrances, had not yet run.  If we had allowed the encumbrance posting process to run as designed, it would have shut down Tomcat, run the scrubber and poster subchains, and then brought Tomcat back up.

Although it obviously would have been best to not bring Tomcat up this morning, we already had users back in the system, creating documents, etc. and we had to proceed by determining how to minimize the adverse impact.  Kevin and I discussed which chains were left to run and decided that although not the ideal scenario, the adverse impact would be less by leaving Tomcat up and bypassing the encumbrance updates.  To accomplish that, the following actions were taken:

1)        Deleted KFSXCS20.KFSXS001_01 (so encumbrance feed for KFSXGL_D2 would not be created)

2)        Deleted KFSXCS20.NOTIFY_FOR_APWX_01 (so notify for KFSXGL_D2 would not be created).

3)        Placed a hold on KFSXGLEF_D2.KFSX_JAVA_01 so that output from the KFSXGLEF_D2.COLLECT_FILES_01 component could be checked to verify that no data was collected.  Once that verification was performed, this chain component was released to run.

In the future, when there are production job aborts within the Tomcat down (KFSX Online Access Off) window, Tomcat will be down in the morning and should remain down until the problems are resolved.  In today’s case, we were lucky that the update chain failed far enough into the process that having users back online was not as problematic.  However, had the failure been earlier within the chain, that may not have been the case.   It is extremely important that consultation regarding the status of the KFSX Applications Manager production schedule occur prior to making decisions about whether Tomcat should be manually started. Let me know if you have questions.

Janice.

 

We did have a document go into exception this morning due to a user updating a PO (entering receiving information ?) at the same time the Auto Close PO job was updating the PO.  John H was able to approve the document.

Kevin.

 

 

 

 

 

Aborted Module Name:   APWXLOOK_AM.APWXLOOK_01

 

  Date:        Day:      Time:          Resolution:

06/04/10     Fri          08:18           Restarted by Jan.

 

Error log and follow up comments:

 

+ 1>> /ais01/dat/work/prod/APWXLOOK_AM.APWXLOOK_01_jobstat

+ cat /ais01/dat/work/prod/APWXLOOK_AM.APWXLOOK_01_jobstat

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

grep: 0652-033 Cannot open /ais01/src/sql/temp/APWXLOOK_OK.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

 

I created /ais01/src/sql/temp/APWXLOOK_OK and restarted APWXLOOK_AM.APWXLOOK_01, complete.

Jan

 

 

 

 

Aborted Module Name:   FAIDEPLS_EV.LYNX_01

 

  Date:        Day:      Time:          Resolution:

07/10/13     Wed       15:01          Restarted by Joleen.

 

Error log and follow up comments:

 

 

 [OracleException]: ORA-20100: *Error* in call to rp_award.p_update: ORA-06502: PL/SQL: numeric or value error: NULL index table key value

ORA-06512: at "BANINST1.RP_AWARD", line 1840

ORA-06512: at "BANINST1.RP_AWARD", line 1895

 

Karma asked me to bypass the aborted LYNX step and let the rest of the process flow run. FAIDEPLS_E-PLUS has finished running.

Joleen.

 

 

 

 

Aborted Module Name:   KFSXFPPC.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

12/28/09     Mon       15:12           Restarted by ITS.

 

Error log and follow up comments:

 

RunBatch ERROR: Exception found:

java.lang.RuntimeException: Error parsing xml error Parsing error was encountered on line 3209, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

                at org.kuali.kfs.fp.batch.service.impl.ProcurementCardLoadTransactionsServiceImpl.loadProcurementCardFile(ProcurementCardLoadTransactionsServiceImpl.java:75)

                at org.kuali.kfs.fp.batch.ProcurementCardLoadStep.execute(ProcurementCardLoadStep.java:54)

                at org.kuali.kfs.sys.batch.Job.runStep(Job.java:156)

                at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:74)

                at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

Caused by: org.kuali.kfs.sys.exception.XMLParseException: error Parsing error was encountered on line 3209, column 44: cvc-minLength-valid: Value '' with length = '0' is not facet-valid with respect to minLength '2' for type 'twoCharType'.

            

Missing a chart code in KFSXFPPC.KFSXS008_01.utl_file2, KFSXFPPC.FCS3571B_20091223_143413SB_000350.cdf.xml.   I entered CO.  Please restart this step/job.

Kevin.
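
Note: a minimal sketch of a pre-check that could flag blank two-character values before the load step runs. It is an illustration only, not part of the KFS batch code. The element name chartOfAccountsCode and the sample file layout are hypothetical; the real element names would need to be taken from the procurement card schema.

```python
import xml.etree.ElementTree as ET


def short_values(xml_text, tag_names, min_length=2):
    """Return (position, tag, text) for watched elements shorter than min_length.

    position is the 1-based order of the element in the document, which helps
    locate the offending record. tag_names holds local names (no namespace).
    """
    problems = []
    root = ET.fromstring(xml_text)
    for position, element in enumerate(root.iter(), start=1):
        local_name = element.tag.rsplit("}", 1)[-1]  # drop any {namespace} prefix
        text = (element.text or "").strip()
        if local_name in tag_names and len(text) < min_length:
            problems.append((position, local_name, text))
    return problems


if __name__ == "__main__":
    # Hypothetical two-record file; the second record has a blank code.
    sample = (
        "<transactions>"
        "<transaction><chartOfAccountsCode>CO</chartOfAccountsCode></transaction>"
        "<transaction><chartOfAccountsCode></chartOfAccountsCode></transaction>"
        "</transactions>"
    )
    for problem in short_values(sample, {"chartOfAccountsCode"}):
        print(problem)
```

Running a check like this against the incoming file would report the blank value up front, rather than leaving it to be found from the parser's line and column numbers after the abort.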

 

 

Aborted Module Name:   OSYSJOBS_04.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/05/10     Fri         16:31           Restarted by Jan.

 

Error log and follow up comments:

 

** REMOVE log FILES OLDER THAN 30 DAYS

***

<#/ais02/job/temp/sys_purg_rsh.ksh.805#> cat /ais02/dat/work/prod/OSYSJOBS_04.OSYSPURG_01.4028488.4028492.00_too_old

<#/ais02/job/temp/sys_purg_rsh.ksh.805#> xargs -n25 rm -ef

rm: Removing ./access_log.1265068800

rm: Removing ./error_log.1265068800

rm: Removing ./ssl_request_log.1265068800

 

*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1

 

It is difficult to find errors in all the output; I had a tough time myself.

But this is why the OSYSJOBS_04.OSYSPURG_01 jobs failed:

The alm_orautl filesystem was not mounted on Kebler for some reason, so the find command could not run. I mounted the filesystem.

<#/ais02/job/temp/sys_purg_rsh.ksh.916#> find /alm_orautl/hrdevl/ -type f -mtime +7 -print

<#/ais02/job/temp/sys_purg_rsh.ksh.916#> 1>> /ais02/dat/work/prod/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00_too_old

find: 0652-010 The starting directory is not valid.

<#/ais02/job/temp/sys_purg_rsh.ksh.916#> errtrap_rsh /ais02/job/temp/sys_purg_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.78#> [[ 1 > 0 ]]

<#errtrap_rsh.78#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1

***

<<errtrap_rsh.7>> exit 1

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00.2010_03_07_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00.2010_03_07_1630.log

+ cut -f 2 -d =

+ grep SCRIPT ABORTED

rsh_return_code=1

+ rm -ef /ais02/log/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00.2010_03_07_1630.log

rm: Removing /ais02/log/OSYSJOBS_04.OSYSPURG_01.4033088.4033092.00.2010_03_07_1630.log

+ print *** \n*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1  \n*** EXIT  WITH EXIT CODE=1  \n***

***

*** RSH EXECUTED SCRIPT sys_purg_rsh.ksh EXIT CODE=1 

*** EXIT  WITH EXIT CODE=1 

Rich.
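
Note: a minimal sketch of the kind of guard that would make this failure easier to spot. It checks that the starting directory exists before looking for old files and raises a clearly named error if it does not. This is not the sys_purg_rsh.ksh logic. The function and exception names are hypothetical, and the age test is a simple cutoff rather than an exact copy of how find rounds -mtime.

```python
import os
import time
from pathlib import Path


class StartingDirectoryError(RuntimeError):
    """Raised when the purge root is missing, e.g. an unmounted filesystem."""


def files_older_than(root, days, now=None):
    """Return regular files under root last modified more than `days` days ago."""
    root = Path(root)
    if not root.is_dir():
        # Fail early with a message that names the directory.
        raise StartingDirectoryError(f"starting directory is not valid: {root}")
    cutoff = (time.time() if now is None else now) - days * 86400
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_file() and path.stat().st_mtime < cutoff:
                old.append(path)
    return sorted(old)
```

A caller would catch StartingDirectoryError and report the missing mount point at the top of the output instead of deep in the trace.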

 

 

 

 

 

Aborted Module Name:   KFSXON_D1

 

  Date:        Day:      Time:          Resolution:

01/04/10     Mon       13:30           See note from Janice below.

 

 

Error log and follow up comments:

 

KFSXON_D1 is failing with the permission error below:

 

+ rsh_logname=/ais02/log/KFSXON_D1.KFSX_ONLINE_01.3761702.3761703.00.2010_01_04_1329.log

+ rsh Malta2 -l jobsys /ais02/job/prod/kshexe_rsh /usr/local/bin/startup_kfs_tomcat KFSXON_D1.KFSX_ONLINE_01.3761702.3761703.00 2010_01_04_1329 prd

rshd: 0826-813 Permission is denied.

+ exit 1

  Child: Job return = 1

David.

 

As per the recommendation that we should be using the more secure hostname of Malta2, rather than Malta, I modified the production KFSX Tomcat startup/shutdown chain components to use Malta2.  I tested the change on AWTEST, with Guffey2, which worked fine.  However, it was critical that the connection to Malta2 be tested before tonight's production so I also ran this KFSX_ONLINE test on AWPROD.  By the way, it is safe to test bringing Tomcat up against kfsprd because if it is already up, then the script just reports that fact.  This allows us a safe mechanism by which to verify that the rsh connection to the production host machine (Malta2) will function properly.  Even though the Guffey to Guffey2 change worked fine, the Malta to Malta2 change caused a permissions issue with the rsh command.  After discussing the problem with Ron, it was decided to modify the rsh to use root instead of jobsys which solved the problem with Malta2.

We should be "good to go" for tonight's production KFSX_OFFLINE/KFSX_ONLINE chains.

Janice.

 

 

 

 

Aborted Module Name:  FAIDCFEX_RC.SWPCOFE_01

 

  Date:        Day:      Time:          Resolution:

05/28/13     Tue       10:04            Restarted by David.

 

Error log and follow up comments:

 

 

Running SWPCOFE MC:9.0.5

... going through args k=1 arg=-f

... going through args k=2 arg=-o

... going through args k=3 arg=/appworx/out/swpcofe_3091227.lis

Username: Connected.

 

Run Sequence Number:

Encountered Abort Condition

Message is: ABORT: Reconcilation file locked for this term. File not processed.

 

 

Craig,

I’ve created incident I04902 to run an update script to reset the swrpass_recon_lock to null.

Please reply to Candy & David when this is complete.

1 row.

 

update swrpass

set swrpass_recon_lock =  null

where swrpass_term_code = '201310'

Phil.

 

I have run the update:

SQL>

  update swrpass

  set swrpass_recon_lock= null

  where swrpass_term_code = '201310'

SQL> /

 

1 row updated.

SQL> commit;

 

Commit complete.

 

Craig.

 

 

 

 

 

Aborted Module Name:   KFSXPDCH.SEND_MAIL_01

 

  Date:        Day:      Time:          Resolution:

06/08/11     Wed       14:35           Restarted by Janice.

 

Error log and follow up comments:

 

# - For pdp_check_20110608_142215.xml:

# -

# -    Bank 02 Count: 4

# -           Amount: $2200.00

# -    Start Bank 02 check disbursement number: 804520

# -

# -    Bank 05 Count: 10

# -           Amount: $14557.75

# -    Start Bank 05 check disbursement number: 128336

# -

# - ====================================================================

#------------------------------------------------------------------------------

# - Sending Message

#   MIME::Lite version  : 3.027

#   MAIL COMMAND        : smtp.colostate.edu , Debug => '0', Timeout => '60'

#   BUILDING HEADERS

#   BUILDING BODY

SMTP Failed to connect to mail server: A system call received a parameter that is not valid.

at /appworx/csu/exec/SENDMAIL.PL line 792

error is 255

===== Exiting PERL_CSU =====

+ err=255

 

As can be seen from the output log, the email was properly formatted but the module failed due to a connection problem to the mail server.   In this case, we do not want to “re-do” all the complicated conditions associated with the chain component.  Therefore, I deleted all the conditions (in backlog) for the KFSXPDCH.SEND_MAIL_01 aborted component and then restarted the component.    Prior to restarting, I also verified that the #WAIT_FOR_CHK_status_6391208 subvar contained the value “checks”.  This value triggers the CHAIN_FINISH component of KFSXPDCH_PDP_CHECKS_EXTR to request in KFSXBURS_FT_TRANSFER_TO_BURSAR, which transfers the checks to the bursar’s server.  It was clear from the SEND_MAIL component output log that checks were produced, hence the need to confirm that #WAIT_FOR_CHK_status_6391208 had been properly populated.

Let me know if you have questions.

Janice.

 

 

 

 

Aborted Module Name:  AREGDRGC_SP.WAIT_FOR_DARS_01

 

  Date:        Day:      Time:          Resolution:

01/22/10     Fri          00:06          No output File - Deleted per Vicki.

 

Error log and follow up comments:

 

 

The WAIT_FOR_DARS component aborted because within the job_queue_list table, status=E for the particular jobid  (ba10012200041515) for which it was waiting. The status value must be “D”  for successful completion of the WAIT_FOR_DARS component.

Janice.

 

Jamie is taking care of a data condition.  So let’s delete the copy of AREGDRGC that aborted.

Denise Holcombe should be calling over to schedule it for tonight.

Vicki.

 

 

 

 

Aborted Module Name:  HRMSS041.HRMSS041_01

 

  Date:        Day:      Time:          Resolution:

01/08/10     Fri         22:03           See notes below.

 

Error log and follow up comments:

 

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 675

ERROR at line 1:

ORA-20000: **** FATAL ERROR! ****

ORA-06512: at line 791

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

 

22:02:17 791           raise_application_error(-20000, '**** FATAL ERROR! ****');

22:02:17 675                 l_segment := csuh_edi_834_pkg.edi_dmg(

 

 

HR has fixed the error. You may restart the job or whatever you do to complete this job and create the file.

IT scheduling, when you send out error messages, can you please be sure to add this type of information if it exists, as this is what tells us what the real error is and which record it was processing when it failed (see red section).

 

3384959,523712401,"Cutler, Zachary Lucas",19,01/01/2008

095549905,****,"Haas, Donald Edward",EMP,Y,01/01/2008

095549905,,"Haas, Donald Edward",EMP,01/01/2008

095620720,****,"Jones, David S",FAM,Y,01/01/2008

095620720,,"Jones, David S",FAM,01/01/2008

095620720,359648679,"Phelan, Jane P",01,01/01/2008

095620720,536334618,"Phelan-Jones, Savanna B",19,01/01/2008

096469970,****,"O'Grady, Pamela S",FAM,N,01/01/2010

096469970,,"O'Grady, Pamela S",FAM,01/01/2010

096469970,074501050,"O'Grady, Thomas",01,01/01/2010

096469970,001880393,"O'Grady, Brennan",19,01/01/2010

096469970,003886229,"O'Grady, Connor",19,01/01/2010

Error Type: 1

Element Reference: DMG03

Element Value: <NULL>

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 675

Kathy.

 

 

 

Aborted Module Name:  KFSXFPPD.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

01/25/10     Mon       14:15          Restarted by ITS.

 

Error log and follow up comments:

 

2010-01-25 14:09:09,129 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPD.disbursementVoucherPreDisbursementProcessorExtractStep.3848111.3848131.00 steps: [disbursementVoucherPreDisbursementProcessorExtractStep]

2010-01-25 14:09:09,129 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springmodules.orm.ojb.OjbOperationException: OJB operation failed; nested exception is org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: disbursementVoucherDocumentdocumentHeaderId(documentNumber,versionNumber)=585020(585020,9)

RunBatch ERROR: Exception found:

org.springmodules.orm.ojb.OjbOperationException: OJB operation failed; nested exception is org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: disbursementVoucherDocumentdocumentHeaderId(documentNumber,versionNumber)=585020(585020,9)

Caused by: org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: disbursementVoucherDocumentdocumentHeaderId(documentNumber,versionNumber)=585020(585020,9)

                at org.apache.ojb.broker.accesslayer.JdbcAccessImpl.executeUpdate(JdbcAccessImpl.java:522)

                at org.apache.ojb.broker.core.PersistenceBrokerImpl.storeToDb(PersistenceBrokerImpl.java:1918)

                at org.apache.ojb.broker.core.PersistenceBrokerImpl.store(PersistenceBrokerImpl.java:886)

                at org.apache.ojb.broker.core.PersistenceBrokerImpl.store(PersistenceBrokerImpl.java:923)

                at org.apache.ojb.broker.core.PersistenceBrokerImpl.store(PersistenceBrokerImpl.java:793)

                at org.apache.ojb.broker.core.DelegatingPersistenceBroker.store(DelegatingPersistenceBroker.java:220)

                at org.apache.ojb.broker.core.DelegatingPersistenceBroker.store(DelegatingPersistenceBroker.java:220)

                at org.springmodules.orm.ojb.PersistenceBrokerTemplate$9.doInPersistenceBroker(PersistenceBrokerTemplate.java:246)

                at org.springmodules.orm.ojb.PersistenceBrokerTemplate.execute(PersistenceBrokerTemplate.java:141)

                at org.springmodules.orm.ojb.PersistenceBrokerTemplate.store(PersistenceBrokerTemplate.java:244)

                at org.kuali.rice.kns.dao.impl.DocumentDaoOjb.save(DocumentDaoOjb.java:62)

                at org.kuali.rice.kns.service.impl.DocumentServiceImpl.validateAndPersistDocument(DocumentServiceImpl.java:674)

                at org.kuali.rice.kns.service.impl.DocumentServiceImpl.validateAndPersistDocumentAndSaveAdHocRoutingRecipients(DocumentServiceImpl.java:350)

                at org.kuali.rice.kns.service.impl.DocumentServiceImpl.saveDocument(DocumentServiceImpl.java:121)

                at sun.reflect.GeneratedMethodAccessor372.invoke(Unknown Source)

 

 

Caused by: org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else:

Should be able to rerun the job.    Looks like Frank E Johnson was in the middle of  acknowledging this document (585020) at 2:06.

Kevin.
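
Note: the error above names a document number and a version number, which suggests a version-based optimistic lock. The sketch below shows the general technique with a hypothetical in-memory store. It is not the OJB implementation. It illustrates why a simple rerun succeeds: the batch job reloads the document, picks up the new version, and its save no longer conflicts.

```python
class OptimisticLockError(Exception):
    """Raised when a row changed between the time it was read and saved."""


class DocumentStore:
    """Hypothetical in-memory store keyed by document number."""

    def __init__(self):
        self._rows = {}  # doc_id -> (version, data)

    def insert(self, doc_id, data):
        self._rows[doc_id] = (1, dict(data))

    def load(self, doc_id):
        version, data = self._rows[doc_id]
        return version, dict(data)  # hand back a copy, like a detached object

    def save(self, doc_id, data, expected_version):
        current_version, _ = self._rows[doc_id]
        if current_version != expected_version:
            # Someone else saved first; the caller must reload and retry.
            raise OptimisticLockError(
                f"document {doc_id}: expected version {expected_version}, "
                f"found {current_version}"
            )
        self._rows[doc_id] = (current_version + 1, dict(data))
        return current_version + 1
```

If a user and the batch step both load the same version and the user saves first, the batch save raises OptimisticLockError. After a reload the batch save goes through, which matches the advice to rerun the job.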

 

 

 

Aborted Module Name: KFSXFPPC.KFSX_JAVA_03

 

  Date:        Day:      Time:          Resolution:

01/25/10     Mon       19:18          Restarted by ITS.

 

Error log and follow up comments:

 

 

RunBatch ERROR: Exception found:

org.springframework.transaction.UnexpectedRollbackException: JTA transaction unexpectedly rolled back (maybe due to a timeout); nested exception is javax.transaction.RollbackException

Caused by: javax.transaction.RollbackException

                at org.objectweb.jotm.TransactionImpl.commit(TransactionImpl.java:245)

                at org.objectweb.jotm.Current.commit(Current.java:488)

                at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:842)

                at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:651)

                at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:621)

               

 

Invalid email address, Lisa.Klopp@ColoState.EDU

I changed the email address to Purch_acard_help_desk@colostate.edu.  Please rerun the job.

Kevin.

Is Linda.Zafarana@colostate.edu also a problem?

 

/ais02/log $ grep 'User unknown' *3850859* |more

KFSXFPPC.KFSX_JAVA_03.3850859.3850867.00.2010_01_25_1900.log:   com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1

<Linda.Zafarana@colostate.edu>... User unknown

KFSXFPPC.KFSX_JAVA_03.3850859.3850867.00.2010_01_25_1900.log:   com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1

Jan.
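
Note: a minimal sketch of how the rejected addresses could be pulled out of a job log in one pass, so that all bad addresses are found before the rerun rather than one per abort. The pattern is based only on the "550 5.1.1 <address>... User unknown" text shown above. The function name is hypothetical, and other SMTP rejection codes are not covered.

```python
import re

# The address may be wrapped onto the next line, so allow any whitespace
# (including newlines) between the status code and the opening bracket.
REJECTED = re.compile(r"550 5\.1\.1\s*<([^<>\s]+)>\.\.\. User unknown")


def rejected_addresses(log_text):
    """Return each rejected address once, in order of first appearance."""
    unique = {}
    for address in REJECTED.findall(log_text):
        unique.setdefault(address.lower(), address)  # compare case-insensitively
    return list(unique.values())
```

For a log on disk, the text would be read first, for example with pathlib.Path(logfile).read_text(), and then passed to rejected_addresses.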

 

I set the email addresses to Purch_pg_acard_help_desk@mail.colostate.edu.  Please run job.

Kevin.

 

On AWTEST/kfsdevl too?

Janice.

 

I updated kfsdevl for both of these invalid emails. Set to Purch_pg_acard_help_desk@mail.colostate.edu.

John Walker.

 

 

 

 

Aborted Module Name:   ADMSAPPL.ADMSS481_01

  Date:        Day:      Time:          Resolution:

01/11/11     Tue       22:39          Restarted by ITS.

10/28/13     Mon      22:26          Restarted by Joleen.

Error log and follow up comments:

 

01/11/11.

ERROR at line 1:

ORA-00001: unique constraint (SATURN.SABSUPL_KEY_INDEX) violated

ORA-06512: at line 958

 

22:39:15 958  insert into sabsupl

22:39:15 959  (sabsupl.sabsupl_pidm,

22:39:15 960   sabsupl.sabsupl_term_code_entry,

22:39:15 961   sabsupl.sabsupl_appl_no,

22:39:15 962   sabsupl.sabsupl_city_birth,

22:39:15 963   sabsupl.sabsupl_natn_code_birth)

22:39:15 964   VALUES

22:39:15 965   (supl_rec.sabiden_pidm,

22:39:15 966    supl_rec.saradap_term_code_entry,

22:39:15 967    supl_rec.saradap_appl_no,

22:39:15 968    substr(supl_rec.swrlcit_birth_city,1,20),

22:39:15 969    supl_rec.swrlcit_birth_natn);

22:39:15 970    dbms_output.put_line('adding sabsupl recs for '||v_id||' '||cnt);

22:39:15 971             cnt := cnt + 1;

22:39:15 972    end loop;

 

By the way, it looks like a lot of displays are being produced by this program, making the output file rather large.  As an example, the application "essays" appear to be echoed out, which of course can be quite lengthy and is of questionable value for debugging purposes.

Janice.

 

Please restart the chain.

The record that caused the unique constraint (SATURN.SABSUPL_KEY_INDEX) violation was deleted.

Rami.

 

10/28/13.

Similar error as reported 01/11/11.

 

The program was trying to add a sabsupl record when a record already existed.  This student had submitted multiple applications with different birth city records.  Janet Allen in Admissions deleted the existing sabsupl record.  I put a temp version of ADMSS481 on Kebler that excluded the aidms from the student's prior applications.  The long-term fix will be to include the run date criteria in the supl_cursor, as is done in all of the other cursors; it is missing in this one.

Kathy.

 

 

 

Aborted Module Name:   ODSRKFSX.ODSRS002_01

 

  Date:        Day:      Time:          Resolution:

02/01/10     Mon       00:06          Restarted by ITS.

05/03/11     Tue        00:08          See follow up below.

 

Error log and follow up comments:

 

02/01/10.

23:42:14 149        csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_FP_DV_OWNR_TYP_T');

 

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUF_FP_DV_OWNR_TYP_T

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 149

 

I have commented out the mapping from the ODSRS002.  This mapping is no longer valid with the current version of KFS.

Mark.

 

05/03/11.

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUF_COFRS_DETAIL_T

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 140

 

There appears to be a tablespace issue.  One of the DBAs will have to look at it when they get in.

Here is the error on the database:

ORA-01652: unable to extend temp segment by  in tablespace

Mark.

 

 

 

 

Aborted Module Name:   KFSXPDSA.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

02/03/10     Wed       08:13          Restarted by Jan.

 

 

Error log and follow up comments:

 

 

2010-02-03 08:13:47,832 [main] ERROR org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl :: sendAchAdviceEmail() Invalid

 email address. Sending message to BFS_AcctPay@mail.colostate.edu

org.kuali.rice.kns.mail.InvalidAddressException: org.springframework.mail.MailSendException; nested exceptions (1) are:

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <cj.anderson@colostate.edu>... User unknown

;

  nested exception is:

        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <cj.anderson@colostate.edu>... User unknown

Jan.

 

I created Clarity incident I02405 to be assigned to a DBA to correct the invalid email and set to BFS_AcctPay@mail.colostate.edu.

Once the production table is updated, please notify IS Scheduling to re-run the job.

Someone in BFS should update the Payee ACH Account for Christopher Anderson and update his email address (to prevent future aborts).

 

update pdp_pmt_grp_t

set adv_email_addr = 'BFS_AcctPay@mail.colostate.edu'

where adv_email_addr = 'cj.anderson@colostate.edu'

and adv_email_snt_ts is null

Kevin.

 

If this is the standard sql statement which needs to be run for bad email addresses, maybe we could create an On-Request Applications Manager chain to run this sql, passing the “bad email address” into the sql as a parameter?  Then, when such problems occur, BFS could provide a Control Memo to IT Scheduling to request that the chain run and provide the bad email address parameter value.  Of course, BFS would still need to manually update the Payee ACH Account.

Just a thought – Janice.
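
Note: a minimal sketch of the parameterized form of Kevin's update, with the bad address passed in as a bind value rather than typed into the statement. The table and column names come from the SQL above. Everything else is an assumption: sqlite3 stands in for the production database, the function name is hypothetical, and a real on-request chain would pass the parameter through Applications Manager rather than Python.

```python
import sqlite3

FALLBACK = "BFS_AcctPay@mail.colostate.edu"


def redirect_bad_advice_email(conn, bad_address, fallback=FALLBACK):
    """Point unsent advice emails for bad_address at the fallback mailbox.

    Returns the number of rows changed. The caller is responsible for commit.
    """
    if not bad_address or not bad_address.strip():
        raise ValueError("bad_address must not be blank")
    cursor = conn.execute(
        "update pdp_pmt_grp_t "
        "set adv_email_addr = ? "
        "where adv_email_addr = ? "
        "and adv_email_snt_ts is null",
        (fallback, bad_address),
    )
    return cursor.rowcount
```

Binding the address as a parameter means the requester supplies only the bad address, and rows whose advice email has already been sent are left alone, as in the original statement.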

 

There is a Clarity task to fix the java program so it will send the email to bfs_acctpay and not fail.

Kevin.

 

Oh.. that’s even better!  Thanks for the update…..Janice.

 

The Check and ACH jobs are held up until this job completes.

Kevin.

 

This change has been made in KFSPRD…….Kelly.

 

 

 

 

 

Aborted Module Name:   FAIDCFIM_COF_IMPORT

 

  Date:        Day:      Time:          Resolution:

09/21/10     Tue        12:20          See notes below.

 

Error log and follow up comments:

 

Phil has just informed us that COF will not have a file available today for the FAIDCFIM_FA.COF_RESP_01 to process. 

While this component would eventually abort when it doesn’t find the file,  it would be best to simply handle the situation now.

 

Please proceed with the steps outlined below, in the order specified:

1)         Kill the FAIDCFIM_FA.COF_RESP_01 component – it should end up in KILLED status

2)        Delete all the chain components which are in PRED WAIT status, except for the FAIDCFIM_FA.CHAIN_FINISH_01 component.

My preference is to display the chain in backlog via Flow Diagram,  then select all the components to be deleted (in this case, FAIDCFIM_FA.DECRYPT_01 through FAIDCFIM_FA.VPLUS_RCAP-LOOP_01), then right click and select Delete 6 (the 6 indicates  you’ve selected six components to be deleted).

3)        Verify that all chain components which were deleted are in PW-DELETE status.

4)        Delete the “KILLED” FAIDCFIM_FA.COF_RESP_01 component.

5)        Verify that the FAIDCFIM_FA.CHAIN_FINISH_01 component finishes, thereby allowing FAIDCFIM_COF_IMPORT chain to complete.

 

On a more generic note, we often prefer to allow the CHAIN_FINISH chain component to run when we are deleting a chain that has started, but due to a failure or other reasons, is not to run to completion.  One of the key reasons is that the many chain specific subvars which have been defined for the chain will be deleted via the CHAIN_FINISH component, as well as other general cleanup of work files and so on.  However, it cannot be globally said that it would always be safe to run the CHAIN_FINISH component.  Therefore, research would be necessary to determine if the CHAIN_FINISH component (or its associated BEFORE/AFTER conditions) would be taking any action(s) which should NOT be performed.  As an example, the CHAIN_FINISH component of the FAIDCFEX_COF_EXPORT chain has an AFTER condition to request in the corresponding schedule of FAIDCFIM_COF_IMPORT.  Obviously, if we are attempting to delete remaining components of a FAIDCFEX_COF_EXPORT chain, we would NOT want this condition to be performed.  In this case, if we decide to let the CHAIN_FINISH component run, while deleting the remainder of the chain components, we would first have to disable the CHAIN_FINISH conditions to prevent them from running.  CHAIN_FINISH components also may have filenames specified for the “Files to backup”, “Files to empty”, or “Files to delete” prompts which we may not wish to backup, empty or delete.  In general, research is the key to safely allowing the CHAIN_FINISH component to run when deleting the remainder of the chain components.

Let me know if you have questions.

Janice.
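
Note: the selection rule in step 2 above can be written down as a small filter. This is only an illustration of the rule (delete everything in PRED WAIT except CHAIN_FINISH). It is not an Applications Manager interface; the function name and the sample backlog are hypothetical.

```python
def components_to_delete(components, keep_suffix="CHAIN_FINISH_01"):
    """Return names of PRED WAIT components, leaving CHAIN_FINISH in place.

    components is a list of (name, status) pairs as shown in backlog.
    """
    return [
        name
        for name, status in components
        if status == "PRED WAIT" and not name.endswith(keep_suffix)
    ]


if __name__ == "__main__":
    # Hypothetical backlog after step 1 (COF_RESP_01 already killed).
    backlog = [
        ("FAIDCFIM_FA.COF_RESP_01", "KILLED"),
        ("FAIDCFIM_FA.DECRYPT_01", "PRED WAIT"),
        ("FAIDCFIM_FA.VPLUS_RCAP-LOOP_01", "PRED WAIT"),
        ("FAIDCFIM_FA.CHAIN_FINISH_01", "PRED WAIT"),
    ]
    print(components_to_delete(backlog))
```

The killed component is not returned because it is handled separately in step 4, and CHAIN_FINISH is kept so that it can run and clean up in step 5.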

 

 

 

 

Aborted Module Name:   HRMSS041.HRMSS041_01

 

  Date:        Day:      Time:          Resolution:

02/08/10     Mon       21:50          Restarted by ITS.

 

Error log and follow up comments:

 

21:50:51 676                    l_segment := csuh_edi_834_pkg.edi_dmg(

21:50:51 677                                            p_dmg01 => 'D8'

21:50:51 678                                           ,p_dmg02 => to_char(c2.date_of_birth,'YYYYMMDD')

21:50:51 679                                           ,p_dmg03 => c2.sex);

21:50:51 680                    write_segment(ws_file_handle, l_segment,l_segment_count);

21:50:51 681 

 

21:50:51 787  when others then

21:50:51 788       dbms_output.put_line(substr(sqlerrm,1,250));

21:50:51 789       utl_file.fclose(ws_file_handle);

21:50:51 790      DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);

21:50:51 791      DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_backtrace);

21:50:51 792       raise_application_error(-20000, '**** FATAL ERROR! ****');

21:50:51 793  

 

449983464,468623157,"Carpenter, Marian",01,02/01/2010

449983464,400477316,"Carpenter, Abigail",19,02/01/2010

449983464,404453088,"Carpenter, Blair",19,02/01/2010

Error Type: 1

Element Reference: DMG03

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

 

ORA-06512: at line 676

ERROR at line 1:

ORA-20000: **** FATAL ERROR! ****

ORA-06512: at line 792

 

Jennifer or Teri,

Can you please check out the Carpenter dependents and make sure they all have a gender? This job aborted this morning. We have updated  the form to make this field required to eliminate this data issue, but it will not be in production until today or tomorrow.

Please let us and IT scheduling know when you have this fixed and they will restart the job.

Kathy.

 

Hi Jackie,

Please investigate this, since you are now the responsible party for new hire entry in our office.  Let me know if you need any help.   Teri

P.S.  If you see the gender missing, just make sure you are date tracked appropriately and add it.  Then notify Kathy to proceed with the file.

 

 

 

 

 

Aborted Module Name:   AREGCNTB.ODSRS100_01

 

  Date:        Day:      Time:          Resolution:

02/08/10     Mon       08:40          Restarted by ITS.

 

Error log and follow up comments:

 

 

 

02:01:03 255  --*--------------------------------------------------------------------*

02:01:03 256  --************ ADD Records to CUR table from view course_schedule *****

02:01:03 257  --*--------------------------------------------------------------------*

02:01:03 258                begin <<add_cur3>>

02:01:03 259                  insert into csus_applicant_cen_cur

02:01:03 260                  (select * from csus_applicant

02:01:03 261                   where ltrim(rtrim(term)) = csus_f_cur_term_ods);

02:01:03 262                end add_cur3;

02:01:03 263                v_add3_count := SQL%ROWCOUNT;

02:01:03 264 

02:01:03 265           end del_applicant;

02:01:03 266 

02:01:03 267  --*************************************************************************

02:01:03 268  --  FIELD OF STUDY

02:01:03 269  --*************************************************************************

 

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-00001: unique constraint (CSUBAN.CSUS_APPLICANT_CEN_CUR_IX_01) violated

ORA-06512: at line 259

 

 

The problem appears to be with PIDM = 11150486, APLN_REF_NUMBER = 3

There are two entries for this person.  I will work with folks to figure out what the data issue is and we will get it resolved and finish creating the CENSUS Tables.

Vicki.
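
Note: a minimal sketch of how duplicate key values could be listed before the insert is attempted. It assumes the unique index covers PIDM and APLN_REF_NUMBER, which is inferred from the note above and not confirmed. The function name and the row layout (dictionaries with lowercase keys) are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key_fields=("pidm", "apln_ref_number")):
    """Return {key: count} for every key value that appears more than once."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return {key: count for key, count in counts.items() if count > 1}
```

Any key returned by duplicate_keys would violate the unique constraint if all of its rows were inserted, so those are the records to review with the data owners.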

 

 

 

 

 

Aborted Module Name:   HRMSS041.HRMSS041_01

  Date:        Day:      Time:          Resolution:

02/09/10     Tue       08:40          Restarted by ITS.

 

Error log and follow up comments:

 

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 676

ERROR at line 1:

ORA-20000: **** FATAL ERROR! ****

ORA-06512: at line 792

 

08:40:03 676                            l_segment := csuh_edi_834_pkg.edi_dmg(

 

08:40:03 788            dbms_output.put_line(substr(sqlerrm,1,250));

08:40:03 789            utl_file.fclose(ws_file_handle);

08:40:03 790           DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);

08:40:03 791           DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_backtrace);

08:40:03 792            raise_application_error(-20000, '**** FATAL ERROR! ****');

 

I.T. Scheduling,

Can you please add line 676  and the lines around 676 plus the last couple of lines from the output so we can see what person it aborted on?

Kathy.

 

08:40:03 674 

08:40:03 675              -- Member Demographics

08:40:03 676                            l_segment := csuh_edi_834_pkg.edi_dmg(

08:40:03 677                                                            p_dmg01 => 'D8'

08:40:03 678                                                           ,p_dmg02 => to_char(c2.date_of_birth,'YYYYMMDD')

08:40:03 679                                                           ,p_dmg03 => c2.sex);

08:40:03 680                            write_segment(ws_file_handle, l_segment,l_segment_count);

08:40:03 681 

08:40:03 682                            -- Health Coverage

08:40:03 683                            l_segment := csuh_edi_834_pkg.edi_hd(

08:40:03 684                                                            p_hd01 => '030'

08:40:03 685                                                           ,p_hd03 => 'DEN'

08:40:03 686

                                                                    ,p_hd04 =>

551797647,526992512,"Tjalkens, Kimberly",01,10/01/2009

551797647,645807196,"Tjalkens, Jacob C",19,10/01/2009

551797647,652071440,"Tjalkens, Jordan F",19,10/01/2009

551797647,627806573,"Tjalkens, Luke R",19,10/01/2009

551806928,****,"Machol, Janet Lynn",EMP,Y,10/01/2009

551806928,,"Machol, Janet Lynn",EMP,10/01/2009

551911231,****,"Schroder, Daniel James",FAM,Y,02/01/2010

551911231,,"Schroder, Daniel James",FAM,02/01/2010

551911231,341763917,"Schroder, Amy Gillings",01,02/01/2010

551911231,652388519,"Schroder, Finn Joseph",19,02/01/2010

Error Type: 1

Element Reference: DMG03

Element Value: <NULL>

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 676

Thank you so much. This is exactly what we need and we can just forward this onto HR to fix without us having to do any investigative work.

Kg.
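
Note: a minimal sketch of how the context requested above could be collected automatically. It takes the line numbers from the "ORA-06512: at line N" messages, the echoed script lines around each N, and the last few lines of output. It assumes echoed lines look like "HH:MM:SS NNN text", as in the excerpts above. The function name and the defaults are hypothetical.

```python
import re

ORA_LINE = re.compile(r"ORA-06512: at line (\d+)")
ECHO_LINE = re.compile(r"^\d{2}:\d{2}:\d{2} (\d+)\b")


def error_context(log_text, around=2, tail=5):
    """Return (echoed lines near each ORA-06512 line number, last output lines)."""
    lines = log_text.splitlines()
    wanted = {int(number) for number in ORA_LINE.findall(log_text)}
    context = []
    for line in lines:
        match = ECHO_LINE.match(line)
        if match and any(abs(int(match.group(1)) - w) <= around for w in wanted):
            context.append(line)
    last = [line for line in lines if line.strip()][-tail:]
    return context, last
```

The tail size would need to be large enough to reach back to the data records printed just before the error block, since those show which person the job was processing.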

 

 

 

Aborted Module Name:   KFSXFPPC.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

02/08/10     Mon       15:02          Restarted by David.

 

Error log and follow up comments:

 

 

org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.validateContentsAgainstSchema(XmlBatchInputFileTypeBase.java:172)

                at org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.parse(XmlBatchInputFileTypeBase.java:109)

                at org.kuali.kfs.sys.batch.service.impl.BatchInputFileServiceImpl.parse(BatchInputFileServiceImpl.java:73)

                at org.kuali.kfs.fp.batch.service.impl.ProcurementCardLoadTransactionsServiceImpl.loadProcurementCardFile(ProcurementCardLoadTransactionsServiceImpl.java:67)

                ... 4 more

Caused by: org.xml.sax.SAXParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element 'transaction'. One of '{"http://www.kuali.org/kfs/fp/procurementCard":transactionCreditCardNumber}' is expected.

                at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)

                ... 23 more

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

David.

 

Someone’s CC# was wrong. We suspect it was manually entered by Purchasing and that they entered the last digit incorrectly. (They have been running some cards through the State’s PaymentNet system as a test.)

 

The system has been corrected. KFSXS008 will need to be rerun. I am looking to see if there is anything else that needs to be cleaned up in the database. I will contact David when I am ready to rerun.

Kevin.

 

 

 

 

Aborted Module Name: AGENWYWP.AGENS004_01

 

  Date:        Day:      Time:          Resolution:

02/08/10     Mon       20:17          Restarted by ITS.

 

Error log and follow up comments:

 

 

18:00:46 559    put_report_line1('Persons Not Purged: ' || record_count1);

18:00:46 560    put_report_line2('Persons Purged: ' || record_count2);

18:00:46 561  -- report any records that were not able to process because the pidm

18:00:46 562  --  is not found in Banner

18:00:46 563  csug_notworked := 0;

18:00:46 564  v_rptlist1 := null;

18:00:46 565 

18:00:46 566  select count(*) into csug_notworked

18:00:46 567      from csug_purge_ids

18:00:46 568      where marked_flag != 'Y';

 

ERROR at line 1:

ORA-00001: unique constraint (GENERAL.GOBSRID_KEY_INDEX) violated

ORA-06512: at "BANINST1.ICGOKCOM", line 675

ORA-06512: at "BANINST1.ICSPKLDI", line 468

ORA-06512: at "BANINST1.ICSPKLDI", line 561

ORA-06512: at "GENERAL.GT_GOREMAL_AS_LDI", line 5

ORA-04088: error during execution of trigger 'GENERAL.GT_GOREMAL_AS_LDI'

ORA-06512: at line 602

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

18:00:46   5    file_handle     utl_file.file_type;

18:00:46 468  --       purgeable report

18:00:46 602    raise;

 

Robin:  Not sure if this will make much sense, but in the errors:

ERROR at line 1:

ORA-00001: unique constraint (GENERAL.GOBSRID_KEY_INDEX) violated

ORA-06512: at "BANINST1.ICGOKCOM", line 675

ORA-06512: at "BANINST1.ICSPKLDI", line 468

ORA-06512: at "BANINST1.ICSPKLDI", line 561

ORA-06512: at "GENERAL.GT_GOREMAL_AS_LDI", line 5

ORA-04088: error during execution of trigger 'GENERAL.GT_GOREMAL_AS_LDI'

ORA-06512: at line 602

The only one that refers to the sql (AGENS004) is the last one.  That’s why you couldn’t find line 675.  Line 675 refers to the database package BANINST1.ICGOKCOM and so on for the other errors.  We do need to see this entire error, but you won’t find anything helpful in the output for anything other than the last line.

Please restart AGENS004.  We think we’ve got the data fixed.

Bev.
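Bev's point about reading the error stack can be sketched in a few lines of Python. This is only an illustration, assuming the stack has the same two ORA-06512 forms shown above; the function name split_frames is made up for this note and is not part of any job script.

```python
import re

# Matches both forms of ORA-06512 seen in the log above:
#   ORA-06512: at "OWNER.OBJECT", line N   (a stored package, procedure, or trigger)
#   ORA-06512: at line N                   (the anonymous block, i.e. the job's own SQL)
FRAME = re.compile(r'^ORA-06512: at (?:"(?P<unit>[^"]+)", )?line (?P<line>\d+)$')


def split_frames(error_stack):
    """Return (script_lines, stored_frames) parsed from an ORA error stack."""
    script_lines = []
    stored_frames = []
    for raw in error_stack.splitlines():
        match = FRAME.match(raw.strip())
        if not match:
            continue  # not a traceback frame (e.g. ORA-00001, ORA-04088)
        line_no = int(match.group("line"))
        if match.group("unit"):
            stored_frames.append((match.group("unit"), line_no))
        else:
            script_lines.append(line_no)
    return script_lines, stored_frames


stack = '''ORA-00001: unique constraint (GENERAL.GOBSRID_KEY_INDEX) violated
ORA-06512: at "BANINST1.ICGOKCOM", line 675
ORA-06512: at "BANINST1.ICSPKLDI", line 468
ORA-06512: at "BANINST1.ICSPKLDI", line 561
ORA-06512: at "GENERAL.GT_GOREMAL_AS_LDI", line 5
ORA-04088: error during execution of trigger 'GENERAL.GT_GOREMAL_AS_LDI'
ORA-06512: at line 602'''

script_lines, stored_frames = split_frames(stack)
print("Lines in the job's own SQL:", script_lines)
for unit, line_no in stored_frames:
    print("Inside stored object", unit, "line", line_no)
```

Only the line numbers in script_lines can be looked up in the numbered SQL listing in the job output; the others belong to the named database objects.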

 

 

 

 

 

Aborted Module Name:   HRMSCPR_QPS.HRMSS063_01

 

  Date:        Day:      Time:          Resolution:

02/11/10      Thu        08:18          Restarted by ITS.

 

Error log and follow up comments:

 

 

Thu Feb 11 08:18:31 :**** Start of HRMSS063 02/11/2010 08:18:28

Thu Feb 11 08:18:31 :Org Default Account:            Williams,Kathleen   Regular Salary 9 Month

Thu Feb 11 08:18:31 :685.00

Thu Feb 11 08:18:31 :Amount Not Distributed: Schwartz,Rachel     Supp Pay Misc                      533144

Thu Feb 11 08:18:31 :3262.26

Thu Feb 11 08:18:31 :declare

Thu Feb 11 08:18:31 :*

Thu Feb 11 08:18:31 :ERROR at line 1:

Thu Feb 11 08:18:31 :ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

Thu Feb 11 08:18:31 :ORA-06512: at line 1062

Jan.

 

This problem has been resolved.  Account 5331440 was being used on the cost allocation key flex field but it was not on the GL_CODE_COMBINATIONS table.  My guess is that this account has never been used on a labor schedule so GL_CODE_COMBINATIONS was never updated with the account value.  I manually entered the value using the GL Account form which should cause the error to go away the next time the script is executed.

Steve.

 

 

 

 

 

 

 

Aborted Module Name:   HRMSWKSP_01.AROSS142_01

 

  Date:        Day:      Time:          Resolution:

02/15/10      Mon      21:56          Deleted by ITS.

 

Error log and follow up comments:

 

 

21:55:17 253                          dbms_output.put_line('Account Does Not Exist: ' ||

21:55:17 254                                                p_row.employee_csu_id|| ' ' ||

21:55:17 255                                                p_row.account_number ||' ' ||

21:55:17 256                                                va_subcode);

21:55:17 257                          raise_application_error(-20100,  'Account Does Not Exist in TBRACCT, Account B');

21:55:17 258                   end if;

21:55:17 259               end;

21:55:17 260              END;

21:55:17 488       begin

21:55:17 489          --Call the insert rows proc to create transactions and comments.

21:55:17 490          insert_rows(p_rec,

21:55:17 491                      vt_hold,

21:55:17 492                      vc_hold);

21:55:17 493          --Populate the Count Variables

21:55:17 494          vtran_count := nvl(vtran_count, 0) + nvl(vt_hold, 0);

21:55:17 495          vcomment_count := nvl(vcomment_count, 0) + nvl(vc_hold, 0);

21:55:17 496       end;

21:55:17 497    end loop;

 

ERROR at line 1:

ORA-20100: Account Does Not Exist in TBRACCT, Account B

ORA-06512: at line 257

ORA-06512: at line 490

 

Josh,

Is this that off-campus workstudy billing job? It appears a new account number needs to be set up in AROS?

Kevin.

 

The work study folks are out of the office today.

Josh.

 

We deleted the necessary modules and the chain completed.

Dawn.

 

 

 

Aborted Module Name:  HRMSCRU_SAL.HRMS_SPAWN_LOG_01

 

  Date:        Day:      Time:          Resolution:

02/18/10     Thu        09:45          Restarted by Janice.

 

Error log and follow up comments:

 

 

+ print *** \n*** LOG FROM SPAWNED CONCURRENT REQUEST 4613252 (PARENT REQUEST 4613251): \n***

+ 1>> /ais01/dat/work/prod/HRMSCRU_SAL.HRMS_SPAWN_LOG_01.Spawned_Log

+ cat /oraapps/hrprod/log/l4613252.req

+ 1>> /ais01/dat/work/prod/HRMSCRU_SAL.HRMS_SPAWN_LOG_01.Spawned_Log

+ read this_spawned_req

+ grep C

+ cut -f2 -d ?

+ print 4613253?X

+ print *** \n*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION \n***

 

The restart of phase 1 will be tricky – Diane will work with Jan, David or me to facilitate the rerun.  We will force the existing HRMSSAL1 Phase 1 chain to stay “stalled” by placing a hold on the downstream HRMSSAL1.CHAIN_EXIT_01 chain component.  We plan to re-run portions of the HRMSCRU_COSTING_RUNGEN_SAL chain as a stand-alone chain.  Since HRMSCRU_COSTING_RUNGEN_SAL  is a “single run” chain, we deleted the ABORTED HRMSCRU_SAL.HRMS_SPAWN_LOG_01 and all remaining chain components for this subchain except for the HRMSCRU_SAL.CHAIN_FINISH_01.    The stand-alone run of HRMSCRU_COSTING_RUNGEN_SAL cannot be done until the rollbacks are performed within HR – Diane will let us know when that has completed. No action will be required by IT Scheduling.    

Joanne,

I’m not sure if we have ever been in this exact situation before. Unfortunately, the manner in which the HR processes were terminated did not communicate failure of those processes to the Applications Manager chain. Consequently, the RUNGEN component within the HRMSCRU_COSTING_RUNGEN_SAL sub-chain of the HRMSSAL1_PHASE_1_JOBS chain completed successfully, and there is no way for us to rerun that component within the existing HRMSSAL1 chain once it has completed within Applications Manager. However, one of the reasons that the Applications Manager Payroll chains were designed with sub-chains is to make it easier to re-run portions of the payroll processing in situations such as we have today. Rerunning the RUNGEN via a stand-alone execution of the HRMSCRU_COSTING_RUNGEN_SAL chain will most effectively allow us to re-run just the key component(s) that **need** to be rerun, and will be much less error-prone and less time-consuming than trying to modify and rerun the entire HRMSSAL1_PHASE_1_JOBS chain.

Regarding a projected time frame, I can tell you that historically the RUNGEN program averages between 1 and 1½ hours to run (when it starts at 03:00 A.M., which is an idle time as far as online activity goes). However, we cannot even proceed with the RUNGEN rerun until all the rollbacks are completed; Craig/Diane may be able to provide an update on that progress.

If you have any more questions, please let me know.

Janice.

Fyi – the Database rollback process completed and the Payroll Run Rollback is currently underway………….Janice.

Salary payroll has resumed processing. Plan on about 1.5 hours…………..Diane

 

IT Scheduling:

Please monitor the progress of the stand-alone HRMSCRU_COSTING_RUNGEN chain. Upon successful completion of HRMSCRU_COSTING_RUNGEN (chain id 3957236), please release the hold on HRMSSAL1.CHAIN_EXIT_01 to allow the remainder of the HRMSSAL1_PHASE_1_JOBS chain to complete.

Janice.

The salary payroll job has finished. All assignments processed. 3 errors. The Payroll Exception Report is running right now.

Diane.

 

 

 

 

 

Aborted Module Name:   KFSXAPPD.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

02/22/10     Mon        07:29          Restarted by Janice.

 

 

Error log and follow up comments:

 

2010-02-22 07:29:08,831 [main] INFO  org.kuali.rice.kew.docsearch.SearchableAttribute :: Indexing document 628342 for document search...

JVMDUMP006I Processing Dump Event "systhrow", detail "java/lang/OutOfMemoryError" - Please Wait.

JVMDUMP006I Processing Dump Event "systhrow", detail "java/lang/OutOfMemoryError" - Please Wait.

JVMDUMP010I Java Dump written to /app/kfs/javacore.20100222.143045.414050.txt

JVMDUMP013I Processed Dump Event "systhrow", detail "java/lang/OutOfMemoryError".

Exception in thread "QuartzScheduler_QuartzSchedulerThread" Exception in thread "Timer-0" java.lang.OutOfMemoryError

java.lang.OutOfMemoryError

 

Complete java log can be viewed in:

/ais02/log/KFSXAPPD.KFSX_JAVA_01.3970577.3970615.00.2010_02_22_0701.log        

 

Should we try to increase catalina_opts_memory?

Janice.

 

 

 

 

 

Aborted Module Name:   KFSXPDSA.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

02/25/10     Thu        08:16          Restarted by Jan.

 

 

Error log and follow up comments:

 

nested exception is:

      com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <hayley.brown@colostate.edu>... User unknown

      at org.kuali.rice.kns.service.impl.MailServiceImpl.sendMessage(MailServiceImpl.java:63)

 

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1

***

 

I02492 - Fix KFSXPDSA for 02/25/2010

Needs to be assigned to a dba.  Once the update statements run the job can be restarted.

Kevin.

 

This task has been completed.

Kelly.

 

KFSXPDSA.KFSX_JAVA_01 failed again, this time with the invalid user below.

Jan.

 

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

                com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <james.kunesh@colostate.edu>... User unknown

;

  nested exception is:

                com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <james.kunesh@colostate.edu>... User unknown

 

I added task T04672 to Incident I02492 to fix james.kunesh@colostate.edu.

Kevin.
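Since this job failed twice on two different addresses, it may help to pull every rejected address out of a job log in one pass. The sketch below is a rough illustration, assuming the rejection text always has the "550 5.1.1 <address>... User unknown" form shown above. The function name and the sample addresses are placeholders, and the log will only list addresses the mail server actually rejected during that run.

```python
import re

# SMTP rejection text as it appears in the job log, e.g.
#   ... SMTPAddressFailedException: 550 5.1.1 <user@example.edu>... User unknown
REJECTED = re.compile(r"550 5\.1\.1 <([^>]+)>\.\.\. User unknown")


def rejected_addresses(log_text):
    """Return each rejected address once, lowercased, in first-seen order."""
    seen = []
    for address in REJECTED.findall(log_text):
        address = address.lower()
        if address not in seen:
            seen.append(address)
    return seen


# Placeholder log text; the addresses are invented for this example.
sample_log = """Failed message 1: javax.mail.SendFailedException: Invalid Addresses;
  nested exception is:
    com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <first.user@example.edu>... User unknown
;
  nested exception is:
    com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <First.User@Example.EDU>... User unknown
    com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <second.user@example.edu>... User unknown
"""

print(rejected_addresses(sample_log))
```

The resulting list could be attached to the incident so all of the address fixes are requested together.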

 

 

 

 

Aborted Module Name:   FAIDCFAT_SM_GLBDATA-LOOP_01

  Date:        Day:      Time:          Resolution:

02/26/10     Fri         06:19          Restarted by David.

 

Error log and follow up comments:

 

+ print Failure in spawned GLBDATA - abort this module

Failure in spawned GLBDATA - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

Here is the error from the report:

SUNGARD HIGHER EDUCATION                                                     

                                                     POPULATION SELECTION EXTRACT                                                  

                                                          CONTROL REPORT                                               PAGE       1

                                                                                                                                   

              Start Time: 26-FEB-2010 06:16:21                                                                                     

         GLBDATA Version: 8.1                                                                                                      

          Selection ID 1: FAIDCFAT_SM_GRIP                                                                                         

             Application: FINAID                                                                                                   

              Creator ID: FAUSER                                                                                                   

                                                                                                                                  

*ERROR* FAIDCFAT_SM_GRIP query does not exist for Applicatio                                                                       

  SQLCODE = 1403                                                                                                                   

SQL ERROR = ORA-01403: no data found

Same error for Spring:

*ERROR* FAIDCFAT_SP_GRIP query does not exist for Applicatio                                                                       

  SQLCODE = 1403                                                                                                                    

SQL ERROR = ORA-01403: no data found

David.

 

 

 

 

Aborted Module Name: FAIDCFAT_SP.GLBDATA-LOOP_01

 

  Date:        Day:      Time:          Resolution:

02/26/10     Fri         06:06          Restarted by David.

 

Error log and follow up comments:

 

 

 

 

+ print Failure in spawned GLBDATA - abort this module

Failure in spawned GLBDATA - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

Here is the error from the report:

SUNGARD HIGHER EDUCATION                                                     

                                                     POPULATION SELECTION EXTRACT                                                  

                                                          CONTROL REPORT                                               PAGE       1

                                                                                                                                   

              Start Time: 26-FEB-2010 06:16:21                                                                                     

         GLBDATA Version: 8.1                                                                                                      

          Selection ID 1: FAIDCFAT_SM_GRIP                                                                                         

             Application: FINAID                                                                                                   

              Creator ID: FAUSER                                                                                                   

                                                                                                                                  

*ERROR* FAIDCFAT_SM_GRIP query does not exist for Applicatio                                                                       

  SQLCODE = 1403                                                                                                                   

SQL ERROR = ORA-01403: no data found

Same error for Spring:

*ERROR* FAIDCFAT_SP_GRIP query does not exist for Applicatio                                                                       

  SQLCODE = 1403                                                                                                                    

SQL ERROR = ORA-01403: no data found

David.

 

 

 

 

 

Aborted Module Name: HRMSCHK_QPS.CHECK_WRITER_02

 

  Date:        Day:      Time:          Resolution:

02/26/10     Fri         08:10          Restarted by ITS.

 

Error log and follow up comments:

 

The following module is in DB ERROR status:

 

HRMSCHK_QPS.CHECK_WRITER_02

 

There is no output file.  We also looked at the Before and Performed conditions.  We viewed the operator log.

 

Can we just re-start this module?

 

Yes, try to restart it.

David.

 

 

 

 

Aborted Module Name:   KFSXFPPC.KFSX_JAVA_03

  Date:        Day:      Time:          Resolution:

03/01/10     Mon       19:20           Restarted by ITS.

03/10/10     Wed       19:01           Restarted by Janice.

 

Error log and follow up comments:

 

 

03/01/10.    

Pcard transactions didn’t “push” out to people’s action lists.

Bad email?

John Hunter

 

KFSXFPPC.KFSX_JAVA_03 failed with a bad email address.

 

2010-03-01 19:13:55,618 [main] ERROR org.kuali.rice.kew.mail.service.impl.ActionListEmailServiceImpl :: Error sending Action List email.

org.kuali.rice.kew.exception.WorkflowRuntimeException: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Lisa.Klopp@ColoState.EDU>... User unknown

Jan.

 

I updated her email, go ahead and re-run.

John Hunter

 

03/10/10.

grep: 0652-033 Cannot open /ais02/app/kfs/prd/logs/KFSXFPPC.procurementCardRouteDocumentsStep.4048572.4048580.00*.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

***

+ abort_job_flag=y

+ rm -ef /ais01/dat/work/prod/KFSXFPPC.KFSX_JAVA_03_jobstat

rm: Removing /ais01/dat/work/prod/KFSXFPPC.KFSX_JAVA_03_jobstat

 

Looks like it finished successfully at 7:31?

Kevin.

 

Looks like this java program failed last night with:

/ais02/job/temp/kfsx_java_ssh.ksh[79]: 270374 Segmentation fault(coredump)     

I simply tried to rerun it this morning and it finished successfully.

Janice.

 

 

 

 

Aborted Module Name:   AGENWYWP.AGENS004_01

 

  Date:        Day:      Time:          Resolution:

03/04/10     Thu       10:47            Deleted by ITS.

Error log and follow up comments:

 

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character to number conversion error

ORA-06512: at line 602

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

 

+ /Applications Manager/exec/FILESIZE AGENWYWP.AGENS004_01.4020586.4020591.00.2010_03_04_1047.jobout 100

no output from AGENWYWP.AGENS004_01

+ err=100

 

We know why this job aborted.  Joe will be in contact with Vicki regarding the person that we need to have purged from the system.

Marcella.

 

This job will hang until tomorrow per Vicki.

David.

 

Marcella confirmed that we can purge/drop/stop AGENWYWP.  We have identified the data problem and are starting the process to clean that up.  We will just catch up next week.

Vicki.

 

 

 

 

Aborted Module Name:   HRMSS041.HRMSS041_01

  Date:        Day:      Time:          Resolution:

03/05/10     Fri         21:42            Restarted by Jan.

 

Error log and follow up comments:

 

 

+ date

+ echo exiting  SQLP_CSU Fri Mar 5 21:42:30 MST 2010

exiting  SQLP_CSU Fri Mar 5 21:42:30 MST 2010

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

 

521151875,,"Collier, Daye Jamal",E1D,03/01/2010

521151875,995000026,"Collier, Dayesum Arthur",19,03/01/2010

Error Type: 1

Element Reference: DMG03

Element Value: <NULL>

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

 

ORA-06512: at line 664

Declare

Jan.

 

Hi Teri and Jennifer,

The gender is missing for the following person. Please fix this ASAP and let IT Scheduling know so they can restart the job.

Thanks. Kg.
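The DMG03 abort stops at the first person with a missing gender, so each rerun can surface only one problem at a time. A check along the following lines could list every such record before the run. This is only a sketch: the record layout, the key names (member_id, gender), and the function name are all hypothetical and do not reflect the real extract.

```python
def missing_mandatory(records, required=("gender",)):
    """Return (member_id, field) pairs for every empty required field."""
    problems = []
    for record in records:
        for field in required:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                problems.append((record.get("member_id"), field))
    return problems


# Made-up rows; the key names are hypothetical, not the real extract layout.
rows = [
    {"member_id": "A001", "name": "Example, Parent", "gender": "F"},
    {"member_id": "A002", "name": "Example, Child", "gender": None},
    {"member_id": "A003", "name": "Example, Other", "gender": " "},
]

for member_id, field in missing_mandatory(rows):
    print(member_id, "is missing", field)
```

A report like this could be forwarded to HR in one message rather than one abort at a time.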

 

 

 

Aborted Module Name:   OSYSJOBS_01.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a missing “then” within a newly added “if” statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components, which finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.
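A missing “then” is the kind of error a shell can report without running anything, using its standard -n (no-execute) option. The sketch below wraps that check in Python as an illustration only. It defaults to sh; passing shell="ksh" assumes ksh is installed on the host. The helper names and the two script fragments are made up and are not taken from sys_purg_rsh.ksh.

```python
import os
import subprocess
import tempfile


def syntax_ok(script_path, shell="sh"):
    """Parse the script with `<shell> -n` (read commands, execute nothing)."""
    result = subprocess.run([shell, "-n", script_path],
                            capture_output=True, text=True)
    return result.returncode == 0, result.stderr.strip()


def check_text(script_text, shell="sh"):
    """Write script_text to a temporary file and syntax-check it."""
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as handle:
        handle.write(script_text)
        path = handle.name
    try:
        return syntax_ok(path, shell)
    finally:
        os.remove(path)


# Hypothetical fragments: the first omits "then", the second is well formed.
broken = 'if [ -f /tmp/example ]\n  echo "found"\nfi\n'
fixed = 'if [ -f /tmp/example ]\nthen\n  echo "found"\nfi\n'

print("broken parses:", check_text(broken)[0])
print("fixed parses:", check_text(fixed)[0])
```

Running a check like this after editing a shared script would catch the error before the scheduled components pick the script up.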

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632:

A file or path name is too long.
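The file name in the cp error suggests that each run added another .bak.<timestamp> suffix to a file that was already a backup, until the name exceeded the file system limit. One way to avoid that is to strip any existing suffixes before adding a new one, or to skip files that are already backups. The sketch below illustrates both, assuming the suffix always has the .bak.YYYY_MM_DD_HHMM form seen above. The function names and the sample file name are made up, and this is not how the purge script is actually written.

```python
import re
from datetime import datetime

# One or more trailing ".bak.YYYY_MM_DD_HHMM" suffixes, as in the cp error above.
BAK_SUFFIXES = re.compile(r"(?:\.bak\.\d{4}_\d{2}_\d{2}_\d{4})+$")


def is_backup(name):
    """True when the name already ends in a .bak.<timestamp> suffix."""
    return BAK_SUFFIXES.search(name) is not None


def backup_name(name, when):
    """Build a backup name with exactly one .bak.<timestamp> suffix."""
    base = BAK_SUFFIXES.sub("", name)
    return base + ".bak." + when.strftime("%Y_%m_%d_%H%M")


# Invented file name with two stacked suffixes.
stacked = "example.log.bak.2010_03_11_0722.bak.2010_03_11_1630"
print(is_backup(stacked))
print(backup_name(stacked, datetime(2010, 3, 20, 16, 32)))
```

With either check in place the backup name stays the same length from run to run.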

 

 

 

 

Aborted Module Name:   OSYSJOBS_02.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a missing “then” within a newly added “if” statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components, which finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632:

A file or path name is too long.

 

 

 

Aborted Module Name:   OSYSJOBS_03.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a missing “then” within a newly added “if” statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components, which finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632:

A file or path name is too long.

 

 

 

 

Aborted Module Name:   OSYSJOBS_05.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a missing “then” within a newly added “if” statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components, which finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632:

A file or path name is too long.

 

 

 

Aborted Module Name:   OSYSJOBS_08.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

 

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a missing “then” within a newly added “if” statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components, which finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632:

A file or path name is too long.

 

 

 

 

 

Aborted Module Name:   OSYSJOBS_09.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a missing “then” within a newly added “if” statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components, which finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632:

A file or path name is too long.

 

 

 

Aborted Module Name:   OSYSJOBS_11.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

 

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a missing “then” within a newly added “if” statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components, which finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632:

A file or path name is too long.

 

 

 

 

Aborted Module Name:   OSYSJOBS_12.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a missing “then” within a newly added “if” statement). I corrected the syntax error and resubmitted a few of the failed OSYSPURG components, which finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632:

A file or path name is too long.

 

 

 

Aborted Module Name:   OSYSJOBS_13.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a "then" was missing from a newly added if statement).  I corrected the syntax error and resubmitted a few of the failed OSYSPURG components – they finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

 

 

 

Aborted Module Name:   OSYSJOBS_15.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a "then" was missing from a newly added if statement).  I corrected the syntax error and resubmitted a few of the failed OSYSPURG components – they finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

 

 

 

Aborted Module Name: OSYSJOBS_04.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a "then" was missing from a newly added if statement).  I corrected the syntax error and resubmitted a few of the failed OSYSPURG components – they finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

 

 

 

 

Aborted Module Name:   OSYSJOBS_06.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a "then" was missing from a newly added if statement).  I corrected the syntax error and resubmitted a few of the failed OSYSPURG components – they finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

 

 

 

Aborted Module Name:   OSYSJOBS_07.OSYSPURG_01

 

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a "then" was missing from a newly added if statement).  I corrected the syntax error and resubmitted a few of the failed OSYSPURG components – they finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

 

 

 

Aborted Module Name:   OSYSJOBS_14.OSYSPURG_01

  Date:        Day:      Time:          Resolution:

03/10/10     Wed       16:46           Restarted by ITS

03/20/10     Sat          16:32           Restarted by Jan.

 

Error log and follow up comments:

 

 

03/10/10.

ERROR: Tomichi SCRIPT ABORTED - EXIT CODE=2

***

<<errtrap_rsh.7>> exit 2

+ grep SCRIPT ABORTED /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/OSYSJOBS_01.OSYSPURG_01.4048750.4048754.00.2010_03_10_1630.log

+ grep SCRIPT ABORTED

 

10 16:30:34-Null

Exiting with su job error code[2].

 

There was a syntax error in the /ais02/job/temp/sys_purg_rsh.ksh script (a "then" was missing from a newly added if statement).  I corrected the syntax error and resubmitted a few of the failed OSYSPURG components – they finished successfully.

IT Scheduling:

Please reset the remainder of the failed OSYSPURG components.

Janice.

 

03/20/10.

This is the error Rich found and listed in the news:

cp: /app/oracle/product/midtier_10.1.2.ban/opmn/logs/OC4J~RAMCTimportGrades~default_island~1.bak.2010_03_11_0722.bak.2010_03_11_1630.bak.2010_03_12_1630.bak.2010_03_13_1630.bak.2010_03_14_1631.bak.2010_03_15_1630.bak.2010_03_16_1630.bak.2010_03_17_1630.bak.2010_03_18_1630.bak.2010_03_19_1631.bak.2010_03_20_1632: A file or path name is too long.

 

 

 

Aborted Module Name:   HRMSCPR_SAL_HRMSS063_01

 

  Date:        Day:      Time:          Resolution:

03/23/10     Tue          13:53          Restarted by David.

 

Error log and follow up comments:

 

Org Default Account:    Mashek,Kimberly     Regular Salary

896.94

Org Default Account:    Mashek,Kimberly     Regular Salary

323.89

Org Default Account:    Roberts,James       Retro Salary

375.00

Org Default Account:    Florcke,Cornelia    Regular Salary

1470.00

declare

*

ERROR at line 1:

ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

ORA-06512: at line 1062

 

This should now be fixed so the HRMSS063 script can be restarted.  Account 1206320 needed to be added to the GL code combinations table.

Steve Hill.

 

 

 

 

Aborted Module Name:   DB ERRORS

  Date:        Day:      Time:          Resolution:

03/15/10     Mon      23:00           Restarted by David.

05/17/10     Mon      10:00           See note from Janice below.

05/20/10     Thu       20:00           Restarted by David.

 

Error log and follow up comments:

 

I was paged because the following modules were in DB ERROR status:

WHRSL021.CHAIN_CANCEL_01

WHRSL036.CHAIN_CANCEL_01

The first one listed above did not have an output file, however the conditions were present with Timing of “BEFORE” and Performed of “DONE”, so I called the Oncall  Programmer.  The second one listed above had an output file present, so I called the Oncall Programmer.  David Peterson was the person I spoke to.

Dawn.

 

WHRSL021 and WHRSL036 both had DB Errors on the CHAIN_CANCEL modules. I reset the conditions on WHRSL021, reset it, and it completed okay. WHRSL036, however, is stuck in backlog: all of its modules are in finished status, but I am unable to delete it. I added this chain to the exception file so WHRS could finish.

 

Followup:

Greg or Rich will need to delete WHRSL036 from backlog in the morning.

David.

 

The following module is in DB ERROR status.

HRMSACH_HRL.NACHA_01

This component went into DBERROR status when an AFTER condition attempted to run the underlying SQL (against HRPROD, using the @hrprod_apps link from AWPROD) for the #HRMS_PAYROLL_ACTION_ID subvar.

ErrorMsg: AwE-5001 Database Query Error (5/17/10 9:49 AM)                      

Details: Error evaluating subvar #HRMS_PAYROLL_ACTION_ID for job: 4408012      

java.sql.SQLException: ORA-12154: TNS:could not resolve the connect identifier specified

 

There is really no way to recover from this other than manually performing this AFTER condition since we cannot rerun this chain component.  So, I determined the correct value for Payroll Action ID and updated the #HRMSACH_HRL_PAYROLL_ACTION_ID with the value (30752000).  Then I deleted the HRMSACH_HRL.NACHA_01 chain component to allow the remainder of the chain to complete.

Janice.

 

05/20/2010 21:24    DBARRETT

Received DB Error page on KFSXGLSC_D1.COLLECT_FILES_01.

As there was an output file present, I called David who will investigate.

05/20/2010 22:28    DEPETERS

DB Error occurred on KFSXGLSC_D1.COLLECT_FILES_01. I reset conditions and re-started it.

 

 



 

Aborted Module Name:   OSYSJOBS_04.OSYSPURG_01

  Date:        Day:      Time:          Resolution:

03/29/10     Mon       16:30           Restarted by Jan.

 

Error log and follow up comments:

 

 

Rich called and was nice enough to let me know that it is sometimes easier to find an error in the output files if you look for code=.  For example, if you open the output file on OSYSJOBS_04.OSYSPURG_01 that is in ABORTED status right now and you look for “code=”, then you will find the real reason the module aborted just above the place where you found “code=”.

 

See below: (The actual error is: The starting directory is not valid)

 

find: 0652-010 The starting directory is not valid.

<#/ais02/job/temp/sys_purg_rsh.ksh.959#> errtrap_rsh /ais02/job/temp/sys_purg_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.80#> [[ 1 > 0 ]]

<#errtrap_rsh.80#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1

***

 

When you report the error, include the directory along with it. It is usually a find or rm command that failed, and the error usually appears a short distance above “code=”. This job has two output files: it failed on /work/tmp, and then on alm_orautil when I restarted it. It seems no one has pointed this out to you, but I think it is important that ITS is told whether they have the right message so they can learn from it.

Rich.
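
Rich's tip can be sketched as a small helper. This is only an illustration: show_abort_context is a hypothetical name, the default of 20 lines is arbitrary, and awk is used because grep's context options are not available in every grep.

```shell
# show_abort_context LOGFILE [LINES]
# Print the LINES lines (default 20) leading up to, and including, the first
# line that contains "code=" in any letter case.
show_abort_context() {
    awk -v n="${2:-20}" '
        { line[NR] = $0 }                 # remember every line seen so far
        toupper($0) ~ /CODE=/ {           # first "code=" / "CODE=" line
            start = NR - n
            if (start < 1) start = 1
            for (i = start; i <= NR; i++) print line[i]
            exit
        }
    ' "$1"
}
```

Run against the output file of the aborted module, this shows the find or rm message that sits just above the first "code=" line.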

 

 

 

 

 

Aborted Module Name:  VPLUS_LIST_KFSX_REPORT_ACCESS

 

  Date:        Day:      Time:          Resolution:

03/31/10     Wed       06:02           Restarted by ITS.

 

Error log and follow up comments:

 

I was checking my emails before getting ready for work and noticed the VPLUS_KFSX_PREPORT_01 job Aborted. Take a look at this error. It seems there is an invalid entry in the vplus_list_report_access.txt file which is causing the problem. Verify the name is correct and it exists.

 

Corrected report names and restarted module (see above)

 

echo \n*** REPORT:CSUFR091_enc_AA  \n

+ 1>> /ais01/spool/vplus/out/vplus_list_kfsx_report_access.txt

+ vadmin com=gsr rep=RptViewN

params=FORMAT=COMMA,REPORT_NAME=CSUFR091_enc_AA

+ 1>> /ais01/spool/vplus/out/vplus_list_kfsx_report_access.txt

Invalid parameter value: CSUFR091_enc_AA.

+ exit 208

  Child: Job return = 208

Rich.

 

I’m not sure how to proceed.

The user requested a name change for CSUFR091_enc_AA report on March 22nd. The report name is now CSUFR_Encumbrance_AA.

Joleen.

 

This job failed because a report was renamed in Vista for kfsx and not renamed in the input file which reports on them.

You need to rename the report in file /ais01/spool/vplus/parms/vplus_list_kfsx_report_access_input

When I first built this list I sorted it. If you rename any files in it, you can do a sort command from the top line in ispf.

There might be others you need to fix based on when you renamed in Vista. The job failed where it got the first error, so maybe there are more after it.

Better check them all to be safe if not sure.

Rich.
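
The rename-and-sort step Rich describes could also be done from the shell. A minimal sketch, assuming the input file holds one report name per line (the file layout is not shown in this log) and using a hypothetical rename_report helper:

```shell
# rename_report OLD NEW FILE
# Print FILE with any line that exactly equals OLD replaced by NEW, re-sorted.
# FILE itself is not modified; redirect the output to a new file for review.
rename_report() {
    awk -v old="$1" -v new="$2" '$0 == old { $0 = new } { print }' "$3" |
        LC_ALL=C sort
}
```

The output would go to a scratch file first, for example rename_report CSUFR091_enc_AA CSUFR_Encumbrance_AA /ais01/spool/vplus/parms/vplus_list_kfsx_report_access_input, and be reviewed before it replaces the real input file.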

 

Joleen, when a KFSX user requests to rename a report in Vista Plus, ITS will be responsible for modifying the report file /ais01/spool/vplus/parms/vplus_list_kfsx_report_access_input.

Please document this process and also include instructions for taking a screenshot before making any required changes.

Steve, thank you for modifying the file and once you're done, please reset/restart VSTAJOBS_VISTA_RELATED_JOBS……….Debbie.

 

 

 

 

 

 

 

 

 

Aborted Module Name:   KFSXTXPM.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

04/02/10     Fri         10:47           Killed by Jan.

 

Error log and follow up comments:

 

 

2010-04-02 10:47:14,367 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) edu.csu.batch.exceptions.BatchServerException: Batch Server Exception:: :: Job Name - KFSXTXPM.payeeMasterExtractStep.4155985.4155986.00, StepName(s) - [payeeMasterExtractStep]

RunBatch ERROR: Exception found:

edu.csu.batch.exceptions.BatchServerException: Batch Server Exception:: :: Job Name - KFSXTXPM.payeeMasterExtractStep.4155985.4155986.00, StepName(s) - [payeeMasterExtractStep]

      at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:76)

      at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

 

rSmart fixed the bug AFTER we did the code merge for 3.0.1.  I have pulled their code change into our project.  It will need to be tested and applied to prod.  So this chain can be killed.

 

I killed the chain.

Jan.

 

 

 

Aborted Module Name:  KFSXPDAP.KFSX_JAVA_02

  Date:        Day:      Time:          Resolution:

04/06/10     Tue        08:09           Restarted by ITS.

 

Error log and follow up comments:

 

 

Caused by: java.net.ConnectException: A remote host refused an attempted connect operation.

      at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:352)

      at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:214)

      at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:201)

      at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:377)

      at java.net.Socket.connect(Socket.java:530)

      at java.net.Socket.connect(Socket.java:480)

      at com.sun.mail.util.SocketFetcher.createSocket(SocketFetcher.java:232)

      at com.sun.mail.util.SocketFetcher.getSocket(SocketFetcher.java:189)

      at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:1250)

      ... 32 more

<#/ais02/job/temp/kfsx_java_ssh.ksh.79#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.22#> [[ 1 > 0 ]]

<#errtrap_ssh.22#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

 

There appears to have been a problem connecting to the mail server. The systems team verified that SMTP looks good on Malta, so can we try re-running the job?

 

Thanks……………John Walker.

 

 

 

 

 

 

 

 

 

Aborted Module Name:   FAIDALCT.SSH_SFTP_DL_01

  Date:        Day:      Time:          Resolution:

08/31/12     Fri         08:31           Restarted by Joleen.

11/08/13     Fri         07:04           Restarted by Steve.

06/18/13     Fri         07:04           Restarted by Joleen.

 

Error log and follow up comments:

 

08/31/12.

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > Permission denied (password,gssapi-with-mic).

***

 

I was able to manually connect to the server now.  Therefore, as long as the Process Flow notes don't preclude it, you should be able to restart the component.

Elden.

 

11/08/13.

FAIDALCT.SSH_SFTP_DL_01 / SFTP_FILSEND is stalled on AWPROD with an EMPTY FILE status.

 

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > Permission denied (password,gssapi-with-mic).

***

 

I restarted this and it finished…

Steve G.

 

06/18/13.

# > Write failed: Broken pipe

# > Connection closed

# > (255)

 

I restarted and the job finished.

Joleen.

 

 

 

Aborted Module Name:   HRMSS041.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

04/16/10     Fri          21:38           Restarted by Jan.

 

 

Error log and follow up comments:

 

# - sftp

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_ebmstpa_csu"  CSU@ftp.ebms.com

# > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

# > @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @

# > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

# > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

# > Someone could be eavesdropping on you right now (man-in-the-middle attack)!

# > It is also possible that the DSA host key has just been changed.

# > The fingerprint for the DSA key sent by the remote host is

# > 7e:8f:8e:dc:7f:74:f9:7b:0f:d9:13:95:32:12:e3:a4.

# > Please contact your system administrator.

# > Add correct host key in /home/jobprd/.ssh/known_hosts to get rid of this message.

# > Offending key in /home/jobprd/.ssh/known_hosts:10

# > DSA host key for ftp.ebms.com has changed and you have requested strict checking.

# > Host key verification failed.

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

 

 

I have asked HR to contact EBMS to let them know the server key has changed.

Diane.
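
Before the known_hosts entry is changed, it helps to see exactly which entry the message points at. A minimal sketch that reads the "Offending key in <file>:<line>" message from a saved job log and prints that line; show_offending_key is a hypothetical helper and it changes nothing:

```shell
# show_offending_key LOGFILE
# Find the "Offending key in <file>:<line>" message in LOGFILE and print that
# line of the known_hosts file so it can be reviewed. Nothing is modified.
show_offending_key() {
    ref=$(sed -n 's/.*Offending key in \(.*\):\([0-9][0-9]*\)[[:space:]]*$/\1 \2/p' "$1" | head -n 1)
    [ -n "$ref" ] || { echo "no offending key reported in $1" >&2; return 1; }
    file=${ref% *}       # path part, before the last space
    lineno=${ref##* }    # line number part, after the last space
    sed -n "${lineno}p" "$file"
}
```

Once EBMS confirms that the new key is legitimate, the stale entry can be dropped with the standard OpenSSH command ssh-keygen -R ftp.ebms.com, run as the job user.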

 

 

 

 

Aborted Module Name:   AREGORAH.AREGS608_01

 

  Date:        Day:      Time:          Resolution:

10/10/12      Wed      13:53          Restarted by Joleen.

 

Error log and follow up comments:

 

 

ERROR at line 1:

ORA-29283: invalid file operation

ORA-06512: at "SYS.UTL_FILE", line 536

ORA-29283: invalid file operation

ORA-06512: at line 234

 

Rob noticed that the userfile was missing a couple characters. I renamed the file and restarted the job. It completed successfully.

Joleen.

 

 

 

 

Aborted Module Name:   KFSXPDAP.KFSX_JAVA_02

  Date:        Day:      Time:          Resolution:

04/27/10      Tue       08:02           Restarted by ITS.

 

 

Error log and follow up comments:

 

 

 

+ grep SCRIPT ABORTED /ais02/log/KFSXPDAP.KFSX_JAVA_02.4291790.4291814.00.2010_04_27_0802.log

+ 1> /dev/null

+ + grep ^*** ERROR: /ais02/log/KFSXPDAP.KFSX_JAVA_02.4291790.4291814.00.2010_04_27_0802.log

+ grep SCRIPT ABORTED

+ cut -f 2 -d =

ssh_return_code=1

+ [[ RunBatch = RunBatch ]]

+ grep KFSXPDAP.KFSX_JAVA_02 /ais01/dat/kfsx/prod/KFSX_PDF_PREFIX_TO_VPLUSRPT

+ chain_vplus_tempdir=/ais01/spool/vplus/temp/KFSXPDAP_tempdir

+ [[ RunBatch = RunBatch ]]

+ [[ -d /ais01/spool/vplus/temp/KFSXPDAP_tempdir ]]

+ print *** \n*** SSH EXECUTED SCRIPT kfsx_java_ssh.ksh EXIT CODE=1  \n*** EXIT WITH EXIT CODE=1  \n***

***

*** SSH EXECUTED SCRIPT kfsx_java_ssh.ksh EXIT CODE=1 

*** EXIT WITH EXIT CODE=1 

***

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

This appears to be a problem connecting to SMTP/Mail. We’ve had this in the past and only need to restart the job.

RunBatch ERROR: Exception found:

org.springframework.mail.MailSendException; nested exceptions (0) are:

Caused by: javax.mail.MessagingException: Could not connect to SMTP host: smtp.colostate.edu, port: 25;

  nested exception is:

        java.net.ConnectException: A remote host refused an attempted connect operation.

Thanks……….John Walker.
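
Since a plain restart has been enough for these SMTP connection refusals, a step like this could be wrapped in a retry. This is a sketch of a swapped-in technique, not how kfsx_java_ssh.ksh works today; retry is a hypothetical helper:

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS runs have failed.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        echo "attempt $n failed; retrying in ${delay}s" >&2
        sleep "$delay"
        n=$((n + 1))
    done
}
```

For example, retry 3 60 some_command would try up to three times, a minute apart. This is only safe when the step can be rerun without side effects such as duplicate emails.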

 

 

 

 

 

Aborted Module Name:   FAIDEPLS_OD.LYNX_01

  Date:        Day:      Time:          Resolution:

11/30/12      Fri         14:06          See follow up below.

 

Error log and follow up comments:

 

It seems their website is down!

Whom do we call?

 

$ ls /appworx/out/LYNX_9435429.00.stdout.txt

/appworx/out/LYNX_9435429.00.stdout.txt

$ more /appworx/out/LYNX_9435429.00.stdout.txt

Gudrun.

 

I called Candy Chapman. The URL needed to be updated. I made the change and restarted the job. Candy is going to send a list of all the LYNX that have wsprod that need to be changed to wsnet.

Joleen.

 

From:

http://wsprod.colostate.edu/cwis231/onet/eplus/faidepls_api_rprawrd.aspx?ay={#2}&treq={#3}

To:

http://wsnet.colostate.edu/cwis231/onet/eplus/faidepls_api_rprawrd.aspx?ay={#2}&treq={#3}
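
For the list Candy is sending, the same host change can be applied mechanically. A minimal sketch, assuming the URLs have been exported to plain text (how they are stored in Appworx is not shown here); to_wsnet is a hypothetical filter:

```shell
# to_wsnet
# Filter: rewrite the wsprod.colostate.edu host to wsnet.colostate.edu in
# URLs read from stdin; all other text passes through unchanged.
to_wsnet() {
    sed 's#://wsprod\.colostate\.edu/#://wsnet.colostate.edu/#g'
}
```

Only the host name is touched, so the path and the {#2} and {#3} prompt placeholders come through as they were.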

 

 

 

 

 

Aborted Module Name:   AROSFRQ1.AROS-PYMTS-LOOP-01

  Date:        Day:      Time:          Resolution:

04/30/10      Fri         08:35          Restarted by Janice.

 

 

Error log and follow up comments:

 

 

Error:-999 -ERR 171 Database error on 'AW_REQUEST' - ORA-20006: User "APPWORX" is not active - contact I.T. Scheduling at 491-1375

ORA-06512: at "APPWORX.AW5", line 110

ORA-06512: at "APPWORX.AW5", line 54

ORA-06512: at line 1

 

 

The Appworx userid was deactivated by a failed mkbanner.  Mark will research why this happened – might be an issue with the banprod_general link.  The mkbanner works fine on AWTEST, using the bantest_general link.  I reset the appworx userid within Appworx to “active” and restarted the jobs which had failed with this error.

Janice.

 

 

 

Aborted Module Name:   FAIDTMIM.TDCLIENT_01

  Date:        Day:      Time:          Resolution:

05/06/10     Thu         08:30          Restarted by David.

11/18/13     Mon        08:31          Restarted by David.

 

Error log and follow up comments:

 

05/06/10.

+ 1>> FAIDTMIM_receive_cmdfile

+ read this_receive_tdclient

+ tdclient_out=/ais01/dat/work/prod/FAIDTMIM.TDCLIENT_01.non_isir

+ tdclientc cmdfile=FAIDTMIM_receive_cmdfile

+ 1> /ais01/dat/work/prod/FAIDTMIM.TDCLIENT_01.non_isir 2>& 1

+ exit 21

+ err=21

+ [ 21 -eq 0 ]

+ [ 21 != 0 ]

+ status=ABORTD

 

Tom fixed the value in the #FAID_TDCLIENT_CUR_PASSWORD subvar.  However, for the aborted component in backlog, changing the value in the subvar has no impact because the FAIDSPWD.TDCLIENT_01 prompt #3 value had already been populated with the old value from the #FAID_TDCLIENT_CUR_PASSWORD subvar when this component was originally submitted to run.  Resolution of this problem could have been approached in either of the following ways:

1)        Delete the chain and request it to run again.

2)        Modify the FAIDSPWD.TDCLIENT_01 prompt #3 (current password) in backlog to match the value which Tom had set in the #FAID_TDCLIENT_CUR_PASSWORD subvar, and restart the failed component.

 

David and I chose option #2 to resolve the abort.    However, if this situation should occur in the future, it might be easier for IT Scheduling to choose option #1.   By the way, this was an unusual situation – the #FAID_TDCLIENT_CUR_PASSWORD value is normally not changed by Tom prior to requesting FAIDSPWD.    However, the SAIG userid had been suspended and the password was reset manually, thereby requiring an update to #FAID_TDCLIENT_CUR_PASSWORD which should have been done prior to requesting FAIDSPWD_TDCLIENT_CHG_PASSWORD.

Janice.

 

11/18/13.

# 20131118-083134 : pipe_exec                 | cmdout = <Bytes per second: sent 129.7, received 255.3

> 

# 20131118-083134 : pipe_exec                 | cmdout = <debug1: Exit status 107

> 

# 20131118-083134 : *** FATAL ***main::check_status | SEND_TO_CMD (close) [0] failed to execute (107)

# 20131118-083134 : *** FATAL ***main::check_status | (100)

# 20131118-083134 : *** FATAL ***main::check_status | #****************************************************************************************************

# 20131118-083134 : *** FATAL ***main::check_status | CMDOUT (close) [37880030] failed to execute (100)

# 20131118-083134 : *** FATAL ***main::check_status | (100)

# 20131118-083134 : *** FATAL ***main::check_status |

 

Looks like we had a connection problem. I re-started and it finished.

David.

 

 

Aborted Module Name:  FAIDSPWD.TDCLIENT_01  

  Date:        Day:      Time:          Resolution:

05/06/10     Thu        08:37          Restarted by David.

 

Error log and follow up comments:

 

Tom called to say he fixed the problem.  Can I just re-set the module?

 

Tom fixed the value in the #FAID_TDCLIENT_CUR_PASSWORD subvar.  However, for the aborted component in backlog, changing the value in the subvar has no impact because the FAIDSPWD.TDCLIENT_01 prompt #3 value had already been populated with the old value from the #FAID_TDCLIENT_CUR_PASSWORD subvar when this component was originally submitted to run.  Resolution of this problem could have been approached in either of the following ways:

1)        Delete the chain and request it to run again.

2)        Modify the FAIDSPWD.TDCLIENT_01 prompt #3 (current password) in backlog to match the value which Tom had set in the #FAID_TDCLIENT_CUR_PASSWORD subvar, and restart the failed component.

 

David and I chose option #2 to resolve the abort.    However, if this situation should occur in the future, it might be easier for IT Scheduling to choose option #1.   By the way, this was an unusual situation – the #FAID_TDCLIENT_CUR_PASSWORD value is normally not changed by Tom prior to requesting FAIDSPWD.    However, the SAIG userid had been suspended and the password was reset manually, thereby requiring an update to #FAID_TDCLIENT_CUR_PASSWORD which should have been done prior to requesting FAIDSPWD_TDCLIENT_CHG_PASSWORD.

Janice.

 

 

 

 

 

 

Aborted Module Name:  RAMCSYNC.RAMCS001_FA  

  Date:        Day:      Time:          Resolution:

05/11/10     Tue        08:50          See follow up by David below.

 

Error log and follow up comments:

 

08:50:06 SQL> /

old   4:   currterm          varchar2(6)        := '&&report_term';

new   4:   currterm          varchar2(6)        := '201090';

  currterm          varchar2(6)        := '201090';

       *

ERROR at line 4:

ORA-04052: error occurred when looking up remote object

WEBCT.PERSON@ELPROD_WEBCT

ORA-00604: error occurred at recursive SQL level 1

ORA-12519: TNS:no appropriate service handler found

 

08:50:06   3  --*   GET INPUT PARAMETER FOR TERM

08:50:06   4    currterm          varchar2(6)        := '&&report_term';

08:50:06   5 

08:50:06   6 

 

I talked with Kelly and discovered that there is currently a database issue with ELPROD. The database will need to be re-started to resolve the problem, but this may not happen for a while because finals are currently underway. RAMCSYNC will fail until this issue is resolved.

David.

 

 

 

 

Aborted Module Name:   PAGER – ON CALL DELAYS

  Date:        Day:      Time:          Resolution:

05/06/10     Thu        19:15          See follow up below.

 

Error log and follow up comments:

 

I was paged because KFSXAW03_START_KFSX_SCHEDULE had not started yet.  It seemed to be waiting on KFSXSYFY_SYS_FISCAL_YR_MAKER which had an aborted module in it (KFSXSYFY.KFSX_JAVA_01).  I tried to check into why the module aborted and could not figure it out.  So, I called Janice and she said she would take care of it.

Dawn.

 

KFSXSYFY.KFSX_JAVA_01 failed with the following error:

2010-05-06 19:15:18,043 [main] INFO

edu.csu.batch.service.RunBatch :: Finished executing job:

KFSXSYFY.fiscalYearMakerStep.4351849.4351850.00 steps:

[fiscalYearMakerStep]

2010-05-06 19:15:18,043 [main] INFO

edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception

(nested)

org.springframework.dao.DataIntegrityViolationException:

OJB operation; SQL []; ORA-02292: integrity constraint

(KFSUSER.GL_ENTRY_TR13) violated - child record found ; nested exception is java.sql.SQLException: ORA-02292:

integrity constraint (KFSUSER.GL_ENTRY_TR13) violated - child record found

 

We've already run the "big" run of the KFSXSYFY chain back in April, so this would just be a run to add new entries since that run. 

I think it can wait until the morning to solve, so I deleted the failed component to allow the remainder of the schedule to complete.

Janice.

 

Fiscal Year Maker failed because there are entries in the GL table (4) that have a reversal date beyond 30-jun-2010.

RunBatch ERROR: Exception found:

org.springframework.dao.DataIntegrityViolationException: OJB operation; SQL []; ORA-02292: integrity constraint (KFSUSER.GL_ENTRY_TR13) violated - child record found ; nested exception is java.sql.SQLException: ORA-02292: integrity constraint (KFSUSER.GL_ENTRY_TR13) violated - child record found

 

Caused by: java.sql.SQLException: ORA-02292: integrity constraint (KFSUSER.GL_ENTRY_TR13) violated - child record found

Kevin.

 

 

 

Aborted Module Name:   KFSXFPPC.KFSX_JAVA_03

  Date:        Day:      Time:          Resolution:

05/11/10     Tue        19:27           Restarted by Jan.

 

 

 

 

Caused by:

javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

       com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Rachael.Crnich@ColoState.EDU>... User unknown

       at com.sun.mail.smtp.SMTPTransport.rcptTo(SMTPTransport.java:1196)

       at com.sun.mail.smtp.SMTPTransport.sendMessage(SMTPTransport.java:584)

       at javax.mail.Transport.send0(Transport.java:169)

       at javax.mail.Transport.send(Transport.java:98)

       at org.kuali.rice.kew.mail.Mailer.sendMessage(Mailer.java:150)

       at org.kuali.rice.kew.mail.Mailer.sendMessage(Mailer.java:170)

       at org.kuali.rice.kew.mail.service.impl.DefaultEmailService.sendEmail(DefaultEmailService.java:66)

       ... 176 more

Caused by:

com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Rachael.Crnich@ColoState.EDU>... User unknown

       at com.sun.mail.smtp.SMTPTransport.rcptTo(SMTPTransport.java:1047)

 

Matt,

Can you look into this?  I suspect it is a bad email address.  I thought the program had been changed to trap that error………………..Kevin.

 

 

 

 

Aborted Module Name:   AGENWYWP.AGENS004_01

  Date:        Day:      Time:          Resolution:

05/25/10     Tue        23:02           Deleted by Janice.

06/02/10     Wed       19:14          Deleted by Janice.

 

Error log and follow up comments:

 

05/25/10.

old   8:   outpath         varchar2(255) := '&&utl_path';

new   8:   outpath         varchar2(255) := '/orautl/BANPROD';

old   9:   not_purgeable_file  varchar2(255) := '&&utl_file1';

new   9:   not_purgeable_file  varchar2(255) := 'AGENWYWP.AGENS004_01.utl_file1';

old  10:   error_file   varchar2(255) := '&&utl_file2';

new  10:   error_file   varchar2(255) := 'AGENWYWP.AGENS004_01.utl_file2';

old  11:   purgeable_file   varchar2(255) := '&&utl_file3';

new  11:   purgeable_file   varchar2(255) := 'AGENWYWP.AGENS004_01.utl_file3';

**** Start of AGENS004 05/25/2010 18:00:46

 

ERROR at line 1:

ORA-00001: unique constraint (GENERAL.GOBSRID_KEY_INDEX) violated

ORA-06512: at "BANINST1.ICGOKCOM", line 675

ORA-06512: at "BANINST1.ICSPKLDI", line 468

ORA-06512: at "BANINST1.ICSPKLDI", line 561

ORA-06512: at "SATURN.ST_SPBPERS_AS_LDI", line 5

ORA-04088: error during execution of trigger 'SATURN.ST_SPBPERS_AS_LDI'

ORA-06512: at line 602

 

I do not see why this program aborted.  I am working with Joe Rymski to reduce the number of people being purged at one time. 

This will run again this evening (Joe said he has worked with IT Scheduling to have this chain run nightly for the next week).

So this abort has been “resolved” for today and we will try again tonight.

Vicki.

 

06/02/10.

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character to number conversion error

ORA-06512: at line 602

 

18:00:46 SQL> DECLARE

18:00:46   2    -- Constants

 

18:00:46 602    raise;

 

Just an FYI and a helpful hint:

I needed to see the UTL file output in order to determine what the problem was.  The information in the error log alone was not sufficient………………..Vicki.

 

The utl files for Banner SQL scripts live in this directory (if the script has created a utl file):

 /orautl/BANPROD……………………Jan.

 

 

 

 

 

 

Aborted Module Name:   HRMSSERP_BI.HRMSS221_01

 

  Date:        Day:      Time:          Resolution:

08/13/10     Sat         03:42            Restarted by Janice.

03/22/14     Sat         04:31            Restarted by Robin.

 

Error log and follow up comments:

 

08/13/10.

ERROR at line 680:

ORA-06550: line 680, column 10:

PLS-00103: Encountered the symbol "ELSE" when expecting one of the following:

* & = - + ; < / > at in is mod remainder not rem <an exponent (**)> <> or != or ~= >= <= <> and or like like2

like4 likec between || multiset member submultiset The symbol ";" was substituted for "ELSE" to continue.

ORA-06550: line 682, column 10:

PLS-00103: Encountered the symbol "END" when expecting one of the following:

* & = - + ; < / > at in is mod remainder not rem <an exponent (**)> <> or != or ~= >= <= <> and or like like2

like4 likec between || multiset member

 

FYI - Since this failure was within the "update" portion of the HRMS schedule, the entire "non-update" portion of the HRMS schedule, plus the WHRS schedule, plus the EIDS schedule, plus ODSRHRMS/ODSREIDS refreshes from Friday night and Sunday night are held up waiting for resolution of this HRMS failure. 

Resolution of this failure needs to occur this morning ASAP to allow all these waiting schedules to proceed!

What testing was performed for the recent changes to HRMSS221?  Program was changed 8/4 and last production run was 7/30.

Janice.

 

I found the problem.  I have changed the code and will now get it rolled to production asap.

-Bob-

 

03/22/14.    

Total records for outfile1 = 99

Total records for outfile2 = 0

Total records for outfile3 = 1773

Total records for outfile4 = 11430

Failed: others

-30036 ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDO_SPACE'

ERROR at line 1:

ORA-20000: Failed: -30036 ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDO_SPACE'Possible error getting pay advice date

ORA-06512: at line 1196

 

 

The job finished successfully.

Jeff, I had to kill your work number concurrent program.  The first one finished successfully and it had moved on to the second one which was killed.

Steve. H.

 

 

 

Aborted Module Name:   AGENWYWP.AGENS004_01

  Date:        Day:      Time:          Resolution:

10/08/10     Fri          18:00           Aborted chain deleted by ITS as per Janice.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-01403: no data found

ORA-06512: at line 602

 

From the output utl_file, /orautl/BANPROD/AGENWYWP.AGENS004_01.utl_file1,  we have the following additional information:

-29337790    -Purge A Shanta, Hanan Ali            SPRIDEN record is newly created: -29337790

Program ended unexpectedly at CSU ID -29337790                                 

SQLCODE/SQLERRM =  100, ORA-01403: no data found                               

Persons Not Purged: 1487     

 

And from /orautl/BANPROD/AGENWYWP.AGENS004_01.utl_file3:

829285367 Kim, Lynn N                                                          

Program ended unexpectedly at CSU ID -29337790                                 

SQLCODE/SQLERRM =  100, ORA-01403: no data found                               

Persons Purged: 14144        

Janice.                                                  

 

Since a portion of this chain completed, please just delete the failed AGENS004_01 component and allow the remainder of the chain to complete.

Janice.

 

 

Aborted Module Name:   AREGDYTR.CONVERT_PDFTOPS_01

  Date:        Day:      Time:          Resolution:

06/22/10     Tue        07:10           Restarted by Jan.

 

Error log and follow up comments:

 

 

# ERROR: File not found (/ais01/spool/out/AREGDYTR.AREGR600.4607666.PDF

# Exiting /appworx/csu/exec/CONVERT_PDF_TO_PS.KSH with Return Code (100)

error is 100

 

There is a problem with the report server.  Josh was going to ask Mark Britton to bounce it.  I’m not sure if this is related to this issue or not.

Vicki.

 

The report server is back up and AREGDYTR is complete.

Jan.

 

 

 

 

 

 

 

 

 

Aborted Module Name:   KFSXBCGB.KFSXS034_01

  Date:        Day:      Time:          Resolution:

06/22/10     Tue        22:01           Restarted by ITS.

 

Error log and follow up comments:

 

22:01:04 674  --End Main Program Logic

22:01:04 675  END MAIN_LOGIC;

22:01:04 676  /

old 190: vfiscal_year           ld_csf_tracker_t.univ_fiscal_yr%type := '&&univ_fiscal_year';

new 190: vfiscal_year         ld_csf_tracker_t.univ_fiscal_yr%type := '2010';

 

22:01:04  45  select           distinct

22:01:04  46              --p.position_nbr              position_nbr

22:01:04  47              substr(p.name, 1, 6)     position_nbr

22:01:04  48             ,nvl(a.effective_start_date,

22:01:04  49                          p.effective_start_date)   effdt

22:01:04  50             ,null                                      obj_id

22:01:04  51             ,1                                                           ver_nbr

22:01:04  52             ,substr(j.name,1, 6)       jobcode

22:01:04  53             ,'A'                                         pos_eff_status

22:01:04  54             ,substr(j.name, 1, 30)    descr

22:01:04  55             ,substr(j.name, 1, 10)    descrshort

22:01:04  56             ,'----'                                     business_unit

22:01:04  57             ,nvl((select 'CO-' || o.attribute1

22:01:04  58                           from hr_all_organization_units@hrprod o

 

select    distinct

*

ERROR at line 45:

ORA-06550: line 45, column 1:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 45, column 1:

PL/SQL: SQL Statement ignored

ORA-06550: line 85, column 2:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 85, column 2:

PL/SQL: SQL Statement ignored

ORA-06550: line 140, column 1:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 140, column 1:

PL/SQL: SQL Statement ignored

 

I noticed that this sql has a link to hrprod – should it have any dependencies on HRMS chain(s)?

Janice.

 

 

 

 

 

Aborted Module Name:   ODSRFAMS.ODSRS002_01

  Date:        Day:      Time:          Resolution:

06/22/10     Tue        23:22           See note below.

 

Error log and follow up comments:

 

/************************************************/

/*  Running LOAD_FAMIS_DEPT_SPACE_FUNC                 */

/*  Started Tue Jun 22 2010, 23:22:16                       */

/************************************************/

Stage 1: Decoding Parameters

|  location_name=CSUFAMIS_LOCATION

|  task_type=PLSQL

|  task_name=LOAD_FAMIS_DEPT_SPACE_FUNC

Stage 2: Opening Task

declare

*

ERROR at line 1:

ORA-20001: Task not found - Please check the Task Type, Name and Location are

correct.

ORA-06512: at "OWBREP.WB_RT_API_EXEC", line 704

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 41

ORA-06512: at line 216

 

23:22:16 215       WHEN 'REFRESH_FAMIS' THEN

23:22:16 216        csug_run_owb_task('OWBREP', 'CSUFAMIS_LOCATION', 'PLSQL', 'LOAD_FAMIS_DEPT_SPACE_FUNC');

23:22:16 217        csug_run_owb_task('OWBREP', 'CSUFAMIS_LOCATION', 'PLSQL', 'LOAD_FAMIS_EMP_SPACE_DEPT');

23:22:16 218  ELSE

David.

 

Mike,

You will have to double-check your syntax; the "Task not found" error indicates that something is wrong with the location name or load name.

Mark A. Paquette.

 

 

 

 

Aborted Module Name:   AREGCNTB.ODSRS100_01

  Date:        Day:      Time:          Resolution:

06/26/10     Sat        02:53           Restarted by Joleen.

 

Error log and follow up comments:

 

 

02:53:52 258 

02:53:52 259  --*--------------------------------------------------------------------*

02:53:52 260  --************ ADD Records to CUR table from view course_schedule *****

02:53:52 261  --*--------------------------------------------------------------------*

02:53:52 262                           begin <<add_cur3>>

02:53:52 263                                       insert into csus_applicant_cen_cur

02:53:52 264                                       (select * from csus_applicant

02:53:52 265                                        where ltrim(rtrim(term)) = csus_f_cur_term_ods);

02:53:52 266                           end add_cur3;

02:53:52 267                           v_add3_count := SQL%ROWCOUNT;

02:53:52 268 

02:53:52 269            end del_applicant;

02:53:52 270 

 

02:53:52 577  --*    /            -- THIS EXECUTES THE PL/SQL BLOCK STORED IN THE BUFFER

02:53:52 578  --*--------------------------------------------------------------------*

02:53:52 579  .

02:53:52 SQL> /

**** Start of ODSRS100 06/26/2010 02:53:52

begin <<main_block>>

*

ERROR at line 1:

ORA-00001: unique constraint (CSUBAN.CSUS_APPLICANT_CEN_CUR_IX_01) violated

ORA-06512: at line 263

Jan.

 

Bev,

Can you please take a look at why we were not able to create the Census Applicant table Friday night?

I’ll take a look as well.

Vicki.

 

We know what is causing the problem, but are waiting to hear back from an end user (Jordan Fritts) to confirm and to fix the data.

We will wait until this afternoon; if we don’t hear back from Jordan by then, we have a plan for how to proceed.

Vicki.

 

 

 

 

 

Aborted Module Name:   HRMSCHK_QPH.CHECK_WRITER_01

 

  Date:        Day:      Time:          Resolution:

12/28/10     Tue        08:07           See follow up below.

 

Error log and follow up comments:

 

 

We were not able to locate an error.

 

I’m not quite sure what happened with this one; it appears that a DB error occurred while trying to evaluate the AFTER conditions.  Since no checks were produced and the #HRMSCHK_QPH_CHECKS_CH subvar contained the correct “NO_CHECKS” value, I just deleted the HRMSCHK_QPH.CHECK_WRITER_01 component to allow the chain to proceed.  Since #HRMSCHK_QPH_CHECKS_CH=NO_CHECKS, the subsequent HRMSCHK_QPH.HRMSS201_01 and HRMSCHK_QPH.HRMSR218_01 components were then properly skipped.

Janice.

 

 

 

 

Aborted Module Name: KFSXCS52.KFSXS007_01

 

  Date:        Day:      Time:          Resolution:

06/30/10     Wed       23:32           Restarted by Jan.

 

Error log and follow up comments:

 

23:32:20 315 

23:32:20 316            Select count(*) into ws_grp_count

23:32:20 317                        from krim_grp_mbr_t

23:32:20 318                       where MBR_ID = X.KFS_PRNCPL_ID

23:32:20 319                         and MBR_TYP_CD = 'P'

23:32:20 320                         and trunc(sysdate) between nvl(ACTV_FRM_DT,trunc(sysdate)) and nvl(ACTV_TO_DT,trunc(sysdate));

23:32:20 321            ws_grp_names := Null;

23:32:20 322            For xx in empl_grp_cursor (X.KFS_PRNCPL_ID)  Loop

23:32:20 323                       ws_grp_names := ltrim(ws_grp_names||' '||xx.grp_nm);

23:32:20 324            End Loop;

 

Terminated Employee: Selzer,Dan R                          6196    dselzer       INACTIVE                                           CSU Ex-Employee

*                                  cardholder=1

Terminated Employee: Shah,Alisha                            46966   *25156*                   INACTIVE                                           Employee

Terminated Employee: Shetter,David Owen          41910   *17876*                   INACTIVE                                           CSU Ex-Employee

declare

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 323

Jan.

 

I increased the field size from 500 to 1500.  Script is in temp.  Hard to believe someone has that many groups.

Kevin.
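
The overflow at line 323 comes from the loop that appends each group name to ws_grp_names.  One rough way to size such a buffer is to compute the longest concatenated string that the loop can produce.  The Python sketch below mimics that loop; the function names and the sample group names are hypothetical, and it counts characters, whereas the real VARCHAR2 limit may be measured in bytes.

```python
def joined_group_names(names):
    """Mimic the PL/SQL loop: ws_grp_names := ltrim(ws_grp_names || ' ' || grp_nm)."""
    joined = ""
    for name in names:
        # lstrip(" ") drops the leading blank left by the first append, like ltrim.
        joined = (joined + " " + name).lstrip(" ")
    return joined


def fits_buffer(names, limit):
    """True when the concatenated names fit in a buffer of `limit` characters."""
    return len(joined_group_names(names)) <= limit


# Hypothetical member of 40 groups, each with a 20-character name.
groups = [f"GROUP_NAME_{i:09d}" for i in range(40)]
print(len(joined_group_names(groups)))   # total length of the concatenated names
print(fits_buffer(groups, 500))          # the old field size
print(fits_buffer(groups, 1500))         # the new field size
```

With this made-up data the string exceeds the old 500-character field and fits in the new 1500-character one.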

 

 

 

 

Aborted Module Name:  KFSXSYCC.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

07/02/10     Fri         16:00           Deleted by Dawn.

 

Error log and follow up comments:

 

+ grep Loading DD /ais02/log/KFSXSYCC.KFSX_JAVA_01.4670272.4670274.00.2010_07_02_1600.log

+ 1> /dev/null

+ sed -n /errtrap_ssh/,$ p /ais02/log/KFSXSYCC.KFSX_JAVA_01.4670272.4670274.00.2010_07_02_1600.log

<#/ais02/job/temp/kfsx_java_ssh.ksh.80#> errtrap_ssh /ais02/job/temp/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.23#> [[ 1 > 0 ]]

<#errtrap_ssh.23#> exit 1

<</ais02/job/prod/kshexe_ssh.74>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1

 

We are trying to rebuild production KFS (so it is down). When rebuilding KFS, we cannot run AppWorx jobs (because the Java libraries are missing). Clear Cache is scheduled every 2 hours, so this failed.

This does not need to run. Can someone just cancel this job?

Thanks………………John W.

 

 

 

 

Aborted Module Name:   KFSXBCUD.KFSXS037_01

  Date:        Day:      Time:          Resolution:

07/06/10     Tue         06:02           Restarted by Jan.

 

Error log and follow up comments:

 

ORA-04052: error occurred when looking up remote object

ORA-00604: error occurred at recursive SQL level 1

ORA-12526: TNS:listener: all appropriate instances are in restricted mode

 

Josh,

Is this script okay to restart?

Jan.

 

Yes this script can be restarted.

Josh.

 

 

 

 

 

Aborted Module Name:   WHRSL011.SQLLOAD-LOOP_01

  Date:        Day:      Time:          Resolution:

07/09/10     Fri         22:40           Restarted by Jan.

 

Error log and follow up comments:

 

+ print Failure in spawned loader - abort this module

Failure in spawned loader - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

07/09/2010 23:02    JMWILKIN

Just checking on the WHRSL011 (last prv fy loader job) - it failed in the SQLLOAD - looks like the whrs_prv_fy_exphist_00 table definition on ODSPROD does not match the whrs_cur_fy_exphist_00 - missing the SUBACCT column on the whrs_prv_fy_exphist_00.  I'm guessing this was a new column added to the cur_fy and the like change was overlooked for the prv_fy version of the table?  DBA will need to add column, so guess this will have to hang until Monday.

SQL*Loader-466: Column SUBACCT does not exist in table "CSUHR"."WHRS_PRV_FY_EXPHIST_00".
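
A mismatch like this can be spotted by comparing the column lists of the two tables before the loader runs.  The following is a minimal Python sketch, not something that exists in the chain; the function name is hypothetical, and apart from SUBACCT and BENEFIT_ACCT the column names are made up.  In practice the two lists could be pulled from Oracle's ALL_TAB_COLUMNS view.

```python
def missing_columns(reference_cols, target_cols):
    """Return columns that are in the reference table but not in the target table."""
    target = {col.upper() for col in target_cols}
    return [col for col in reference_cols if col.upper() not in target]


# SUBACCT and BENEFIT_ACCT appear in the log; the other column names are made up.
cur_fy_cols = ["FISCAL_YEAR", "ACCT", "SUBACCT", "BENEFIT_ACCT"]
prv_fy_cols = ["FISCAL_YEAR", "ACCT", "BENEFIT_ACCT"]
print(missing_columns(cur_fy_cols, prv_fy_cols))
```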

 

WHRSL011.SQLLOAD-LOOP_01 in LOADFAIL status is inhibiting the completion of WHRSAWY1_FYEND_ROLLOVER_TASKS.

Joleen.

 

WHRSL001 is complete with errors:

Record 8257: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8258: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8259: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8260: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8261: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Record 8262: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.

ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)

Table "CSUHR"."WHRS_PRV_FY_EXPHIST_00":

  2178 Rows successfully loaded.

  10000 Rows not loaded due to data errors.

  0 Rows not loaded because all WHEN clauses were failed.

  0 Rows not loaded because all fields were null.

Space allocated for bind array:                 255420 bytes(258 rows)

Read   buffer bytes: 1048576

Total logical records skipped:          0

Total logical records read:         12222

Total logical records rejected:     10000

Total logical records discarded:        0

Jan.
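
When a load rejects thousands of rows, as in the log above, it can help to summarize the ORA-12899 messages by column instead of reading them one at a time.  The Python sketch below is one possible way to do that, assuming the message layout shown in the excerpt; the function name is hypothetical and the sample text is copied from the log above.

```python
import re
from collections import Counter

# Matches the ORA-12899 text as it appears in the log excerpt above.
ORA_12899 = re.compile(
    r'ORA-12899: value too large for column (?P<column>\S+)\s*'
    r'\(actual:\s*(?P<actual>\d+),\s*maximum:\s*(?P<maximum>\d+)\)'
)


def summarize_ora_12899(log_text):
    """Count ORA-12899 rejections per (column, actual length, maximum length)."""
    counts = Counter()
    for match in ORA_12899.finditer(log_text):
        key = (match["column"], int(match["actual"]), int(match["maximum"]))
        counts[key] += 1
    return counts


sample_log = '''Record 8257: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.
ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)
Record 8258: Rejected - Error on table "CSUHR"."WHRS_PRV_FY_EXPHIST_00", column BENEFIT_ACCT.
ORA-12899: value too large for column "CSUHR"."WHRS_PRV_FY_EXPHIST_00"."BENEFIT_ACCT" (actual: 7, maximum: 6)
'''

for (column, actual, maximum), count in summarize_ora_12899(sample_log).items():
    print(f"{column}: {count} rejected (actual {actual}, maximum {maximum})")
```

A summary that shows a single column and a single length, as here, suggests a column-definition problem rather than scattered bad data.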

 

 

 

 

Aborted Module Name:   HRMSCPR_HRL.HRMSS064_01

 

  Date:        Day:      Time:          Resolution:

03/18/11     Fri          14:18           Restarted by ITS.

 

Error log and follow up comments:

 

Amount Not Distributed: Virgin,Joanna           1450     6467700.      50.00

Frozen Acount: 193563 used 1313850

Frozen Acount: 202887 used 1301320

Frozen Acount: 256481 used 1354280

Frozen Acount: 236808 used 1354280

Frozen Acount: 246103 used 1354280

 

ERROR at line 1:

ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

ORA-06512: at line 961

 

The problem is that the program is trying to distribute the $50.00 amount for Virgin,Joanna into the 6467700 account, which does not exist in the gl_code_combinations table.

Who is the appropriate person to decide whether the wrong account is being used, and who is responsible for inserting the account if it is new?

-Bob-

 

Vickie Schultz added the account to the student's labor schedule (which put it in the gl_code_combinations table).  You should be able to rerun this chain now.

Kevin.

 

 

 

 

Aborted Module Name:   HRMSENCD.HRMSS079_01

  Date:        Day:      Time:          Resolution:

07/27/10     Tue       18:00           See follow up below.

04/26/11     Tue       18:00           See follow up below.

05/29/13     Wed      18:01           See follow up below.

 

Error log and follow up comments:

 

 

07/27/2010 19:45    MUELLER

Steve called about the HRMSS079 failure.  I talked to Diane, Steve and Craig.  Craig found some hung sessions and killed them allowing HRMSS079 to complete and the HRMSENCD chain to continue.

 

07/27/2010 19:45    CPERRY

I got a call from Jan about the Encumbrance process erroring out trying to rebuild an index. The message was about 'Could not acquire resource'.  I looked on kebler and there were a couple of hung forms sessions. I killed these and the process ran.  They must have had a hold on the indexes that were trying to be rebuilt by the Encumbrance job.

 

04/26/2011 18:23    JMWILKIN

I noticed that a page went out for critical job failure in HRMSENCD chain.  It's the error we sometimes see in HRMSS079 with resource busy.  I tried to resubmit HRMSS079, but it failed again with same message.  Might wait a bit and try again - if still no luck, then probably should give DBA a call to just check on the HR database?

Robin just called about the HRMSENCD since she had received the page.  As we chatted about the course of action, I decided to try to restart one more time before we contacted anyone else... and it finished successfully.

 

05/29/2013 20:51    DMCINTOS

HRMSENCD.HRMSS079_01 is in critfail status.  I looked into it.  Had to contact the oncall DBA.  Craig is looking into it. Craig informed me that he found a forms session that had locks on some of the PSP tables.  He killed it.  He asked me to restart the job and I did.  It is running now.

 

 

 

 

 

Aborted Module Name:   HRMSBURS_EM.SEND_MAIL_01

 

  Date:        Day:      Time:          Resolution:

10/29/10       Fri         13:20          See note from Janice below.

 

Error log and follow up comments:

 

 

SEND_MAIL components failed with the following error:

 

SMTP Failed to connect to mail server: A system call received a parameter that is not valid.

 at /appworx/csu/exec/SENDMAIL.PL line 774

error is 255

 

I talked with Elden about this error and we decided to just retry the jobs.  They completed successfully so apparently whatever caused the mail server problem has been corrected.  Elden may pursue some follow-up with ACNS regarding the mail server.

Janice.

 

 

 

 

 

 

Aborted Module Name:   DOITKFSX_RG.DOIT_GET_FILE_01

  Date:        Day:      Time:          Resolution:

01/17/13      Thu       10:54        Restarted by Steve.

 

Error log and follow up comments:

 

 

+ print *** \n*** DOIT_GET_FILE FAILED TO FIND DOIT FILES WITHIN LOOP COUNT MAX TRIES \n***

*** DOIT_GET_FILE FAILED TO FIND DOIT FILES WITHIN LOOP COUNT MAX TRIES

+ exit 100

 

I found references to this same error in our abort log, and in each case the job was restarted.  I restarted this one and it seems to be finding files again now.

 

 

Steve. G.

 

 

 

 

 

 

Aborted Module Name:   AROSDPA3_PAYMENT_APPLICATION_3

  Date:        Day:      Time:          Resolution:

08/16/10     Mon       11:31           Restarted by ITS.

 

Error log and follow up comments:

 

RUN SEQUENCE NUMBER:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "ODSMGR.ROKODST", line 362

ORA-00001: unique constraint (FAISMGR.RORNCHG_INDEX_01) violated

ORA-06512: at "ODSMGR.ROKODST", line 16

ORA-06512: at line 1

ORA-06512: at "ODSMGR.GOKODST", line 69

ORA-06512: at "TAISMGR.TT_TBRACCD_INSERT_ODS_CHANGE", line 24

ORA-04088: error during execution of trigger 'TAISMGR.TT_TBRACCD_INSERT_ODS_CHANGE'

ORA-06512: at "BANINST1.DML_TBRACCD", line 68

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 1685

ORA-06

WRN-ORACERR: Error occurred in file "tgrappl.pc" at line 3,028

WRN-ERRSTMT: Following statement was last statement parsed:

    begin tb_receivable . p_update ( p_PIDM => :ap_request_pidm , p_TRAN_N

tgrappl terminated with error

0 lines written to /appworx/out/AROSDPA3.TGRAPPL_01.4899213.4899217.00.2041209.lis

 

Jacque/Joe,

Can we restart Application of Payments 3 due to the deadlock error?

Josh.

It is OK to restart………… Jacque Clark.

 

I got this message when I tried to re-start. Please advise.

Starting TGRAPPL (Release 8.1.1.1)

 *                   **WARNING**                         *

*  You cannot submit this job - it is already running.  *

*                                                       *

*  You will also get this message if a previous run of  *

*  this program aborted.  If this is the case, the      *

*  control record for that run must be deleted before   *

*  proceeding. (GJBPRUN record for this jobname with    *

*  a -1 one-up-no).               

David.         

 

The GJBPRUN table contains this record:

TGRAPPL

               -1 01       16-AUG-10

Ctrl rec shows job-in-progress

 

I think we may need to delete this record in order to restart the job:

delete from gjbprun where gjbprun_one_up_no = '-1' and gjbprun_job = 'TGRAPPL';

Please advise……………Janice.

 

I remember seeing this error before.  When a job aborts, it sometimes puts a record in the gjbprun table with a one_up_number = -1.  In the past we have deleted this record from gjbprun and the process was able to be re-run.

Mike Giebler.

 

OK, I see the -1 number out there.

I will verify that a version is not running, have the record removed, and restart the job…………….Josh.

 

 

 

            

 

 

 

Aborted Module Name:   WHRSL023.HRMSS020_01

  Date:        Day:      Time:          Resolution:

08/17/10     Tue        22:05           Restarted by Janice.    

 

Error log and follow up comments:

 

08/17/2010 22:26    JLHUTCH

I WAS PAGED AT 10:07 & 10:37 PM ABOUT A DB ERROR ON WHRSL023.HRMSS020_01. CONDITIONS WERE PRESENT WITH TIMING OF "BEFORE" AND PERFORMED OF "DONE".

I CALLED JANICE AND SHE SAID SHE WOULD TAKE A LOOK AT IT.

 

======================================================================

08/17/2010 22:50    JMWILKIN

The DBERROR was one of those weird situations that we see occasionally where it appears that the so_log column of so_job_queue table for the jobid fills up.  I updated the so_log column with a shorter entry, double-checked the conditions which had already completed, and then restarted the failed component.

 

 

 

 

Aborted Module Name:   HRMSREC_SAL.HRMSS226_01

  Date:        Day:      Time:          Resolution:

09/20/10     Mon      10:24           Restarted by Janice.

Error log and follow up comments:

 

Mon Sep 20 10:24:20 :205 -ERR 171 Database error on 'SUBMIT_OAE_JOB' - ORA-20096: Cannot submit concurrent request for program HRMSS226

 

Check if the concurrent program is registered with Application Object Library.

 

Check if you specified the correct application short name for y

Mon Sep 20 10:24:20 :Error SUBMIT_OAE_JOB failed.

 

Mon Sep 20 10:24:21 MDT 2010

Contents of /appworx/out/o7145818:

Mon Sep 20 10:24:20 :Starting OAE Processing on jobid 5076812

     

     

Mon Sep 20 10:24:23 MDT 2010                                    Page 1

                    Concurrent Program Parameter(s)                  

Parameter               Value                                        

----------------------- ----------------------------------------     

Responsibility          CSU Human Resources Payroll                  

Program App. Short Name CSUH                                         

Job to Run              HRMSS226                                     

salary_end_date         2010/09/30 00:00:00                          

email_listserv          hrsao_campus_rec_dedn@mail.colostate.edu     

utl_path                                                             

utl_file1              

 

The new HRMSS226 program was not registered on hrprod (this is first time to run in production).  Bob took care of that and I restarted the aborted HRMSREC_SAL.HRMSS226_01 chain component, which has now successfully completed.

Janice.

 

 

 

 

 

 

Aborted Module Name:   AREGDYTR.SQLLOAD-LOOP_01

 

  Date:        Day:      Time:          Resolution:

08/26/10     Thu        07:16           See follow up below.

 

Error log and follow up comments:

 

 

+ egrep ABORTED|CRITFAIL

+ grep 4961448

      4961448.00 BATCH     AREGDYTR.SQLLOAD_01 08/26 07:16 00:00:01 ABORTED                AREGDYTR_DAILY_TRANSCRIPT

 

+ print Failure in spawned loader - abort this module

Failure in spawned loader - abort this module

 

With looper scripts, such as  SQLLOAD-LOOP, determination of the problem requires research into what caused the “spawned loader” to fail.  The error message from AREGDYTR.SQLLOAD_01:

 

Record 2: Rejected - Error on table CSUBAN.SWLTNSC, column STREET1.

ORA-12899: value too large for column "CSUBAN"."SWLTNSC"."STREET1" (actual: 63, maximum: 40)

 

Table CSUBAN.SWLTNSC:

  55 Rows successfully loaded.

  1 Row not loaded due to data errors.

 

The data within  “street1” of the bad record was not longer than 40 characters (actually it was only 33 characters):

OFFICE OF THE EDUCATIONAL ATTACH*

However, the last character, although displaying as *, is not really an asterisk character.  The hex value for this character is C9, whereas a true * character has hex value of 2A.  I simply edited the data file, /ais01/ftp/from/user/AREGDYTR.AREG.SWLTNSC.DAT, removed the offending “C9” character at the end of the street1 on record #2, resubmitted the SQLLOAD-LOOP and the newly spawned SQLLOAD component finished successfully.   

 

Perhaps follow-up regarding this weird character should be done with National Student Clearinghouse, from whom this data originated (earlier in the chain).

Janice.
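
A stray byte like the one described above can be located without opening the file in an editor.  The following Python sketch is only an illustration and is not part of the chain; the function name is hypothetical, and the sample data is a made-up two-record stand-in for the real data file.

```python
def find_non_ascii(data):
    """Return (line_number, column, hex_value) for every byte outside printable ASCII."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            # Printable ASCII is 0x20 through 0x7E; tab (0x09) is tolerated here.
            if byte != 0x09 and not 0x20 <= byte <= 0x7E:
                hits.append((line_no, col, f"{byte:02X}"))
    return hits


# Made-up sample; record 2 ends with the 0xC9 byte described above.
sample = b"RECORD ONE\nOFFICE OF THE EDUCATIONAL ATTACH\xc9\n"
for line_no, col, hex_value in find_non_ascii(sample):
    print(f"line {line_no}, column {col}: byte 0x{hex_value}")
```

For a real file, the bytes could be read with open(path, "rb").read() and passed to the same function.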

 

 

 

 

Aborted Module Name:   FAIDCFEX_FA.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

06/20/13     Thu         06:18          See note from Joleen below.

 

Error log and follow up comments:

 

#   LOGNAME        : jobprd

#   USER           : jobprd

#   SRC FILE       : /ais01/ftp/to/user/FAIDCFEX_FA.2013_06_20_0600.gpg

#   DST FILE       : cofcsu@ftp.college-assist.org:query/FAIDCFEX_FA.2013_06_20_0600.gpg

#   IDENTITY       : /home/jobprd/.ssh/csu_infosys_prod

#   DIR HOST       :

 

 

I removed the aborted FAIDCFEX jobs. We will run these again when the COF server is available.

Joleen.

 

 

Aborted Module Name:   HRMSDED_SAL.HRMSRPTS-LOOP_01

  Date:        Day:      Time:          Resolution:

09/03/10     Fri           15:06          Restarted by Janice.

 

Error log and follow up comments:

 

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

Could not open server pipe.

======

Print failed! File:/appworx/out/HRMSDED_HRL.HRMSRPTS-LOOP_01.5005908.5005913.00.2010_09_03_1458.AWPROD.LOG Command:PRINTSIZE -d /appworx/out

Fri Sep 3 15:06:00 MDT 2010 Done with BODY

+ exit

retry of sizing successful

Retry on JOB_COMPLETION successful.

 

This was related to the Appworx problem.  I had Craig kill the hrprod report (HRMSR002) which the HRMSRPTS_LOOP had spawned (and was still running) so we could just restart the failed component over again from HRMSR002.

Janice.

 

 

 

 

 

Aborted Module Name:   AROSFRQ1.AROS-PYMTS-LOOP_01

  Date:        Day:      Time:          Resolution:

07/08/10     Fri           17:09          Restarted by Steve.

 

Error log and follow up comments:

 

 

I found past references to this error in our "Abort log" document indicating that we had restarted the component.  I did the same, and now AROSFRQ1.AROS-PYMTS-LOOP_01 is running again and has spawned another AROSFRQ1.AROS_PYMTS_01.

Steve.

 

 

 

Aborted Module Name:   AREGHRTM_SECTION_ENROLLMENT

  Date:        Day:      Time:          Resolution:

09/07/10     Tue         07:20          Launch errors, see Janice’s note.

 

Error log and follow up comments:

 

AREGHRTM_SECTION_ENROLLMENT is a cyclic chain, in that its chain schedules are defined to run hourly, 7 days per week.  Based on the “Scheduled start date” and “Scheduled end date” for the chain schedules, only the AREGHRTM_FA schedule is currently active.  Unfortunately, “Single run” cannot be specified for cyclic chains, as that interferes with multiple schedules (FA/SM/SP) running concurrently.  Consequently, when failures occur, the cycles for a given schedule may back up, i.e. there may be many iterations of the chain schedule within backlog, each with a failed/LAUNCH ERROR component.

Such was the case with the AREGHRTM_FA schedule of AREGHRTM_SECTION_ENROLLMENT last night: this morning there were nine “stalled” iterations of this chain schedule, each with a chain component in LAUNCH ERROR status.  In this type of situation, it would NOT be appropriate to restart LAUNCH ERROR components within all nine of these iterations at the same time, i.e. we would not want more than one iteration of a specific schedule of this chain running at the same time.  Fortunately, this morning the components simply went back into LAUNCH ERROR status (due to the BANPROD database problem) when they were reset multiple times, thereby saving us from the possibility of nine iterations for the same schedule running simultaneously.

The appropriate method for cleanup in such situations is to analyze the status of the failed/stalled iterations of the chain, then delete iterations from backlog as appropriate until only one (or maybe no) iteration remains.  There is no generic rule that can be applied: some chains with many components may require that a failed component be restarted and the remainder of the chain be allowed to complete, whereas in other cases it may be appropriate to simply delete the chain component and/or chain.  Thorough analysis of the situation is always required.

However, in general, we can state that it would not be appropriate to take any action that would cause multiple iterations of the same chain schedule for a cyclic chain to run simultaneously.

 

To provide for more complete chain cleanup for this particular chain, I utilized the following procedure:

1)  Delete the “LAUNCH ERROR” AREGHRTM_FA.AREGS415_01 component.

2)  Wait for the AREGHRTM_FA.CHAIN_FINISH component and this iteration of AREGHRTM_SECTION_ENROLLMENT to complete.

3)  Repeat steps 1 and 2 for all iterations “backed up” in backlog for the AREGHRTM_FA schedule of AREGHRTM_SECTION_ENROLLMENT.

 

Note that since the AREGHRTM_FA chain schedule runs hourly, one iteration of the chain has successfully run since the BANPROD recycle was completed this morning.  For this reason, there was no need to finish running any of the leftover iterations in backlog that had LAUNCH ERRORS.

Janice.
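
Sketch (not part of Janice's note): the general rule above, never leave more than one iteration of the same cyclic chain schedule able to run, can be illustrated as a small selection helper. The Iteration record, its field names, and the assumption that a larger job id means a more recent iteration are all hypothetical; this does not call any Appworx interface, and it does not replace the case-by-case analysis the note calls for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Iteration:
    """One backlog iteration of a chain schedule (hypothetical record)."""
    job_id: int
    schedule: str
    status: str


def iterations_to_delete(backlog, schedule, keep=1):
    """Return the stalled iterations to delete so that at most `keep` remain.

    Assumption: a larger job_id means a more recent iteration.
    """
    stalled = sorted(
        (it for it in backlog
         if it.schedule == schedule and it.status == "LAUNCH ERROR"),
        key=lambda it: it.job_id,
    )
    if keep <= 0:
        # Nothing needs to survive, e.g. a later iteration already ran cleanly.
        return stalled
    return stalled[:-keep]
```

With keep=0 the helper selects every stalled iteration, which matches the case above where a later hourly run had already completed successfully.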

 

 

Aborted Module Name:   FAIDSAIG_EV.TDCLIENT_01

 

  Date:        Day:      Time:          Resolution:

11/03/10     Wed      00:04           Restarted by Janice.

 

Error log and follow up comments:

 

FROM KEBLER:

Executing Transfer SAIGPORTAL-IDAP10OP                        

-----------------------------------------------

********** Start Communications Session

Connecting to server SAIGPORTAL...

200 Command OK.              

Connected.                                                                

FTP login failed.531 Change password required

Login for UserId: [TG51279] failed     

(531) FTP login failed.531 Change password required

                            

Termination started...                    

Disconnecting...                               

221 Goodbye.                         

********** End Communications Session                

 

As the error message indicates, SAIG was requiring that we change our password.  I incremented the SAIG password by updating #FAID_TDCLIENT_NEW_PASSWORD and requested the password change chain, FAIDSPWD_TDCLIENT_CHG_PASSWORD.   Once this chain completed, I reset the failed TDCLIENT components/modules.

Janice.
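
Sketch (not part of Janice's note): a failed TDCLIENT session log can be checked for the "531 Change password required" reply before deciding whether the password change chain is needed. The function name is hypothetical, and the pattern is taken only from the message text shown in the session log above.

```python
import re

# Matches the reply shown in the session log, including the fused
# form "FTP login failed.531 Change password required".
PASSWORD_CHANGE = re.compile(r"(?<!\d)531 Change password required")


def password_change_required(session_log):
    """Return True if the session log contains the 531 password-change reply."""
    return PASSWORD_CHANGE.search(session_log) is not None
```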

 

 

 

 

Aborted Module Name:   HRMSDED_DED.HRMSRPTS-LOOP_01

 

  Date:        Day:      Time:          Resolution:

09/23/10     Thu       09:32           Restarted by Janice.

 

Error log and follow up comments:

 

+ egrep ABORTED|CRITFAIL|C-Error

+ grep 5095654

      5095654.00 HRMS      HRMSDED_DED.HRMSR05009/23 09:32 01:12:42 C-Error     APPWORX    HRMSDED_DEDUCTION_REPORTS

+ print Failure in spawned HRMSR050 - abort HRMSRPTS-LOOP

Failure in spawned HRMSR050 - abort HRMSRPTS-LOOP

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

When looping modules fail, it is helpful to provide the error message that caused the spawned module to fail.

In this case, from the HRMSDED_DED.HRMSR050_01 output file:

 

Thu Sep 23 09:32:03 :ORA-01555: snapshot too old: rollback segment number 17 with name "_SYSSMU17$" too small

Thu Sep 23 09:32:03 : ==> Select /*+ RULE */ asg.assignment_number

Thu Sep 23 09:32:03 :REP-0069: Internal error

Thu Sep 23 09:32:03 :REP-57054: In-process job terminated:Terminated with error:

Thu Sep 23 09:32:03 :REP-300: snapshot too old: rollback segment number 17 with name "_SYSSMU17$" too small

Thu Sep 23 09:32:03 : ==> Select /*+ RULE */ asg.assignment_number

 

We simply restarted the component to try running this report again.

Janice.
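
Sketch (not part of Janice's note): one way to pull the Oracle and Oracle Reports error codes out of a spawned module's output file so they can be quoted in the abort email. The function name is hypothetical; the pattern only covers the ORA-nnnnn and REP-nnnn forms seen in the output above.

```python
import re

# ORA-01555, REP-0069, REP-57054, REP-300 and similar codes.
ERROR_CODE = re.compile(r"\b(?:ORA|REP)-\d+")


def error_codes(joblog_text):
    """Return the distinct ORA-/REP- codes in first-seen order."""
    seen = []
    for code in ERROR_CODE.findall(joblog_text):
        if code not in seen:
            seen.append(code)
    return seen
```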

 

 

 

Aborted Module Name:   FAIDDLIM_OD.RERIM-LOOP_01

 

  Date:        Day:      Time:          Resolution:

10/06/10     Wed       08:40           See note from Janice below.

 

Error log and follow up comments:

 

5159723.00 BANNER    FAIDDLIM_OD.RERIM11_10/06 08:40 00:00:01 ABORTED                FAIDDLIM_DIRECT_LOAN_IMPORT

+ print Failure in spawned RERIM11 - abort this module

Failure in spawned RERIM11 - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

When we have failures of “looper” scripts, such as RERIM-LOOP, problem resolution requires understanding why the spawned module aborted.  So, whenever RERIM-LOOP (or another looper script) issues a message such as:

Failure in spawned RERIM11 - abort this module

IT Scheduling should take the next step and also provide feedback regarding the failure within the spawned module, in this case FAIDDLIM_OD.RERIM11_07, as shown below:

 

                'crbn11op.00_10_01.00_10_02.00_10_05.xml'

                *

ERROR at line 15:

ORA-12899: value too large for column "GENERAL"."GJBPRUN"."GJBPRUN_VALUE"

(actual: 39, maximum: 30)

 

This will facilitate resolution of the problem by providing all the pertinent information.

Janice.
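
Sketch (not part of Janice's note): the ORA-12899 figures above can be checked by hand. The file name quoted in the error is 39 characters long, which matches "actual: 39", while the column allows 30. The helper name is hypothetical, and measuring the length in bytes is an assumption about how the column is defined; for this plain ASCII name, bytes and characters are the same.

```python
def fits_column(value, max_bytes):
    """Return True if the value fits in a column of max_bytes bytes.

    Assumption: the column limit is measured in bytes.
    """
    return len(value.encode("utf-8")) <= max_bytes


# File name quoted in the ORA-12899 error above.
file_name = "crbn11op.00_10_01.00_10_02.00_10_05.xml"
too_long = not fits_column(file_name, 30)
```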

 

 

 

 

Aborted Module Name:  HRMSREC_SAL.WAIT_FOR_COND_01

  Date:        Day:      Time:          Resolution:

10/19/10     Tue        10:42           Restarted by ITS.

 

Error log and follow up comments:

 

From Operator Log Tab.

 

2010-10-19 10:00:03 Condition #1 inserted by OSU=appworx JDBC Thin Client

2010-10-19 10:00:03 Condition #2 inserted by OSU=appworx JDBC Thin Client

2010-10-19 10:42:35 Action argument from #AW99_{chain_id}=HRMSSAL1 to #AW99_5221916=HRMSSAL1 by OSU=appworx JDBC Thin Client

CON-2010-10-19 10:42:35 Set Subvar

CON-2010-10-19 10:42:47 Abort Task

 

This component failed because its BEFORE condition detected that the /userfiles/Uhrcrec/data/HRMSREC.CAMPUS.CSUH_CAMPUS_REC_TRANS.DAT file is empty.  The BEFORE condition’s action is to ABORT TASK when this file does not exist or is empty.  Apparently, an empty feeder file is not considered an acceptable situation.

Janice.
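
Sketch (not part of Janice's note): the test that the BEFORE condition is described as making, that the file exists and is not empty, looks roughly like this outside Appworx. The function name is hypothetical, and this is not how the Appworx condition itself is implemented.

```python
import os


def feeder_file_ready(path):
    """Return True only if the feeder file exists and contains data."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```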

 

The campus recreation file from Campus Rec is empty, so the job aborted. Do you want us to skip the job when the file is empty, or should there always be data for us to pick up?

Diane.

 

There should always be data to pick up.  I was told that the deadline was the end of business today.  So I was getting ready to transfer the file this afternoon.

Jacqueline Nikolai.

 

The file is out there now.

Diane.

 

 

 

Aborted Module Name:   AGENDYHB.SRRSRIN_01

 

  Date:        Day:      Time:          Resolution:

01/20/11     Thu        19:13           See follow up below.

 

Error log and follow up comments:

 

+ echo Program Failed to execute properly ..... program aborting

Program Failed to execute properly ..... program aborting

+ cat tempout.7813240

old: termout ON

new: termout OFF

one_up_is           

+ grep SRRSRIN.lis /ais01/spool/vplus/parms/vplus_report_bookmark_titles

this_report_title=.Electronic_Prospect_Match

+ vplus_report_bookmark=AGENDYHB.SRRSRIN_01.Electronic_Prospect_Match

+ [[ -s /appworx/out/AGENDYHB.SRRSRIN_01.5668319.5668323.00.2182613.lis

+ ]] [[ ! -s /ais01/spool/vplus/temp/AGENDYHB_tempdir/AGENDYHB.SRRSRIN_01.lis ]]

+ print -n -- \n\fREPORT     :  AGENDYHB.SRRSRIN_01.Electronic_Prospect_Match\n\f

+ 1>> /ais01/spool/vplus/temp/AGENDYHB_tempdir/AGENDYHB.SRRSRIN_01.lis

+ cat /appworx/out/AGENDYHB.SRRSRIN_01.5668319.5668323.00.2182613.lis

+ 1>> /ais01/spool/vplus/temp/AGENDYHB_tempdir/AGENDYHB.SRRSRIN_01.lis

+ exit 1

+ err=1

 

Can you tell me where the input file to SRRSRIN is located?............Vicki.

 

The input file to the SRTLOAD step is /ais01/bkp/AGENDYHB.HRMSS095_01.5668319.BKP. I think that SRRSRIN then processes the data from the SRTLOAD step……………David.

 

 

 

 

 

 

 

Aborted Module Name:   HRMSKFS_QPH.HRMSS175_01

 

  Date:        Day:      Time:          Resolution:

10/21/10     Thu        08:23           Deleted by ITS.

 

Error log and follow up comments:

 

 

08:23:37 1627  utl_file.fclose(out_file1);

08:23:37 1628  utl_file.fclose(out_file2);

08:23:37 1629 

08:23:37 1630  If (ctl_sum_salary - l_net - ctl_sum_ee_deductions - ctl_sum_cash_deductions) <> 0 Then

08:23:37 1631        DBMS_OUTPUT.PUT_LINE('###################################################');

08:23:37 1632        DBMS_OUTPUT.PUT_LINE('#####                                                                              #####');

08:23:37 1633        DBMS_OUTPUT.PUT_LINE('####                                                                                          ####');

08:23:37 1634        DBMS_OUTPUT.PUT_LINE('###                    KFS FILE IS OUT OF BALANCE               ###');

08:23:37 1635        DBMS_OUTPUT.PUT_LINE('####                                                                                          ####');

08:23:37 1636        DBMS_OUTPUT.PUT_LINE('#####                                                                              #####');

08:23:37 1637        DBMS_OUTPUT.PUT_LINE('###################################################');

08:23:37 1638  --    RAISE kfs_not_balanced;

08:23:37 1639  End if;

08:23:37 1640 

08:23:37 1640 

08:23:37 1641  DBMS_OUTPUT.PUT_LINE('.');

08:23:37 1642  DBMS_OUTPUT.PUT_LINE

08:23:37 1643       ('**** End   of HRMSS175 ' ||to_char(sysdate,'MM/DD/YYYY HH24:MI:SS'));

08:23:37 1644 

08:23:37 1645  Exception

08:23:37 1646      When null_params Then

08:23:37 1647         raise_application_error(-20000, '**** FATAL ERROR - PARAMATER MISSING! ****');

08:23:37 1648      When kfs_not_balanced Then

08:23:37 1649         Rollback;

08:23:37 1650         raise_application_error(-20000, '**** FATAL ERROR - KFS FILE IS OUT OF BALANCE! ****');

08:23:37 1651  END;

08:23:37 1652  /

 

Just wondering why we need to send this email?  The earlier email, with error message and attached joblog, went to the same distribution list as this email…. so these folks would already be aware of the abort and their responsibility to fix it.

Janice.

 

There doesn’t appear to be a problem.  The best we can tell is that it’s an unusual timing issue.  Let’s see what happens after the next run in about 2 hours.

Bob. V.

 

 

Aborted Module Name:   HRMSQPD_QUICK_PAY_INIT_N_SCHED

  Date:        Day:      Time:          Resolution:

10/22/10     Fri      10:00 – 13:00  See notes below.

 

Error log and follow up comments:

 

We were expecting an Hourly check this morning but haven’t been notified that the 10:00 Quick Pays have run yet.  Do you know what is going on?

Viv.

 

There seems to be a problem with today’s HRMSQPD_QUICK_PAY_INIT_N_SCHED chain.  The HRMSCHK_CHECK_PROCESSING_QPH_1 module is in SELF WAIT status, but we can’t figure out why.  This appears to have stalled the chain.  Please advise,  Thanks...  Steve. G.

 

It looks like the HRMSCHK_CHECK_PROCESSING chain is currently running in HRMSSAL, so HRMSQPD is waiting for the HRMSSAL version to finish.

David.

 

Thanks, David.  We’ll make a note of that possible scenario for the future.

 

Vivian,

See David’s reply above.  I don’t know how long it may take for the HRMSSAL version of HRMSCHK_CHECK_PROCESSING to finish, but apparently the quick pay version will not run until it is done.  It’s possible that by chance we just have never had this timing issue before when running Salary Phase 4.  We’ll keep an eye on it.

Steve. G.

 

The chain kicked back in after 13:05, and HRMSQPD_QUICK_PAY_INIT_N_SCHED is running once more.

Dermot.

 

 

 

 

Aborted Module Name:   AREGDYWL_FA.VPLUS_RCAP-LOOP_01

  Date:        Day:      Time:          Resolution:

10/24/10     Sun        23:01           See note from Janice below.

 

Error log and follow up comments:

 

 

+ awexe jh

+ egrep ABORTED|CRITFAIL

+ grep 5250863

      5250863.00 BATCH     AREGDYWL_FA.VPLUS_RC10/24 23:01 00:00:00 ABORTED                AREGDYWL_STOP_WAIT_LIST_NOTIFY

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

As this error indicates, the spawned module aborted.  Therefore, to determine the cause of the error, it is important to view the error message from the joblog of the aborted spawned module, AREGDYWL_FA.VPLUS_RCAPTURE_01:

 

+ /vplus/rcapture -hvplusprod.is.colostate.edu -dAREGDYWL_FA -m -ips

+ /ais01/spool/vplus/out/VPC.561472/AREGDYWL_FA.5250837.txt.ps

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu  on port: 7980

+ exit 4

error is 4

 

This error was simply a problem connecting to vplusprod - I restarted the AREGDYWL_FA.VPLUS_RCAP-LOOP_01 component so it would re-spawn the VPLUS_RCAPTURE module.

Janice.
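
Sketch (not part of Janice's note): before restarting the looper, a quick TCP check can show whether the Vista Plus host is reachable on the port named in the FATAL message. The function name and the 5 second timeout are hypothetical choices; the check only proves that the port accepts connections, not that rcapture will succeed.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

For the failure above, the call would be can_connect("vplusprod.is.colostate.edu", 7980).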

 

 

 

Aborted Module Name:   OSYSJOBS_05.OSYSLLNK_01

  Date:        Day:      Time:          Resolution:

10/31/10     Sun         16:30           See note from Janice below.

 

Error log and follow up comments:

 

 

Remote Shell errtrap_rsh parm 2 value is 3 <<errtrap_rsh.3>> [[ 3 > 0 ]] <<errtrap_rsh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=3 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=3

***

<<errtrap_rsh.7>> exit 3

+ grep SCRIPT ABORTED

+ /ais02/log/OSYSJOBS_05.OSYSLLNK_01.5287184.5287190.00.2010_10_31_1630.

+ log

+ 1> /dev/null

+ + cut -f 2 -d =

+ grep ^*** ERROR:

+ /ais02/log/OSYSJOBS_05.OSYSLLNK_01.5287184.5287190.00.2010_10_31_1630.

+ log

+ grep SCRIPT ABORTED

rsh_return_code=3

+ print *** \n*** RSH EXECUTED SCRIPT sys_llnk_rsh.ksh EXIT CODE=3 

+ \n*** EXIT  WITH EXIT CODE=3  \n***

***

*** RSH EXECUTED SCRIPT sys_llnk_rsh.ksh EXIT CODE=3

*** EXIT  WITH EXIT CODE=3

 

When the errtrap routine is invoked, this means that there was a non-zero return code for the last command which executed.  Therefore, it is important to include that last command, along with any associated error messages, for troubleshooting the problem.  In this case:

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls <#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_14.OSYSPURG_01.5287280.5287284.00_ban_too_old is not valid.

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_14.OSYSPURG_01.5287280.5287284.00_slave_files is not valid.

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_15.OSYSPURG_01.5287288.5287292.00_too_old is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 3 Remote Shell errtrap_rsh parm 2 value is 3

 

Sometimes we see these "status is not valid" messages for the find command - generally caused by a file being deleted after the find command "found" the filename. Usually, we can just restart jobs in this situation, which I did for the two OSYSJOBS this morning and they have successfully completed.

Janice.
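
Sketch (not part of Janice's note): the race described above, where a file is deleted after its name has been listed but before its status is read, can be tolerated by catching the missing-file error per entry instead of failing the whole scan. This is only an illustration of the idea; it is not sys_llnk_rsh.ksh, it does not reproduce the tmp/proc pruning of the find command, and the function name and the lstat parameter are hypothetical.

```python
import os
import stat


def list_symlinks(root, lstat=os.lstat):
    """Walk root and return (symlinks, vanished).

    `vanished` holds paths that disappeared between being listed and
    being examined, which is the situation behind "status is not valid".
    """
    links, vanished = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = lstat(path).st_mode
            except FileNotFoundError:
                vanished.append(path)
                continue
            if stat.S_ISLNK(mode):
                links.append(path)
    return links, vanished
```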

 

 

 

Aborted Module Name:   AREGDYWL_FA.VPLUS_RCAP-LOOP_01

  Date:        Day:      Time:          Resolution:

04/14/13     Sun         01:02          Restarted by Elden.

 

Error log and follow up comments:

 

04/14/2013 01:20    EFLICK

AREGDYWL_FA.VPLUS_RCAP-LOOP failed with VistaPlus network error.  Confirmed report was not captured to VistaPlus.

Tried resetting it, but it aborted again.  Need to try again after the hierarchy builder runs and refreshes Vista Plus.

 

 

 

 

Aborted Module Name:   AREGDYWL_SP.VPLUS_RCAP-LOOP_01

 

  Date:        Day:      Time:          Resolution:

11/07/10     Sun       23:01           See note from Janice below.

01/02/11     Sun       23:02           No follow up received.

 

Error log and follow up comments:

 

11/07/10.    

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu  on port: 7980

As this error indicates, the spawned module aborted.  Therefore, to determine the cause of the error, it is important to view the error message from the joblog of the aborted spawned module, AREGDYWL_FA.VPLUS_RCAPTURE_01:

+ /vplus/rcapture -hvplusprod.is.colostate.edu -dAREGDYWL_FA -m -ips

+ /ais01/spool/vplus/out/VPC.561472/AREGDYWL_FA.5250837.txt.ps

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu  on port: 7980

+ exit 4

error is 4

This error was simply a problem connecting to vplusprod - I restarted the AREGDYWL_FA.VPLUS_RCAP-LOOP_01 component so it would re-spawn the VPLUS_RCAPTURE module.

Janice.

 

01/02/11.

+ rm -ef /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

rm: Removing /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

+ grep FTP_

+ print AREGDYWL_SP.VPLUS_RCAP-LOOP_01

+ rm -ef /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

 

While it was fine in this case to reset AREGDYWL_SP.VPLUS_RCAP-LOOP_01, it is important to verify why the module actually aborted. The error from AREGDYWL_SP.VPLUS_RCAP-LOOP_01:

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

So appropriate follow-up would need to include verifying why the spawned module (AREGDYWL_SP.VPLUS_RCAPTURE_01) failed:

+ /vplus/rcapture -hvplusprod.is.colostate.edu -dAREGDYWL_SP -m -ips /ais01/spool/vplus/out/VPC.938276/AREGDYWL_SP.5583184.txt.ps

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu  on port: 7980

+ exit 4

error is 4

From the original email, it does not appear that the log for the spawned AREGDYWL_SP.VPLUS_RCAPTURE_01 module was examined. While the end result of resetting the AREGDYWL_SP.VPLUS_RCAP-LOOP_01 was the appropriate course of action, that was only the case because of the particular reason that the spawned AREGDYWL_SP.VPLUS_RCAPTURE_01 failed. 

Janice.

 

 

 

 

 

Aborted Module Name:   AREGDYTR.CONVERT_PDFTOPS_01

  Date:        Day:      Time:          Resolution:

11/10/10     Wed       07:15           See note from Janice below.

12/01/10     Wed       07:18           No follow up received.

 

Error log and follow up comments:

 

11/10/10.

 

Due to the error in AREGR600, there was no output PDF file from AREGR600 to be used as input to the CONVERT_PDFTOPS component.  I deleted this failed component, as well as the subsequent SPOOL_FILTER component which would have spooled the “postscript” output from the CONVERT_PDFTOPS component.

 

*** Oracle Report:  AREGR600

    Processing Failed -- Report was not successfully generated.

*** From Appworx Chain: AREGDYTR_DAILY_TRANSCRIPT

*** Oracle Instance: banprod

*** Report Parameters Used:

    req_levl=AL

    p_source=B

    PDFEMBED=YES

*** Report Errors:

    REP-0177: Error while running in remote server

    Unable to retrieve a string from the Report Builder message file.

    REP--002:

Janice.

 

12/01/10.

 

The before Condition Details indicate it is checking for a file that does not exist:

{#spool_out}/AREGDYTR.AREGR600.{chain_id}.PDF

 

*** Oracle Report:  AREGR600

    Processing Failed -- Report was not successfully generated.

*** From Appworx Chain: AREGORTR_ONREQ_TRANSCRIPT

*** Oracle Instance: banprod

*** Report Parameters Used:

    p_reprint_date=01-DEC-2010

    req_levl=AL

    p_source=R

 

*** Report Errors:

    REP-0177: Error while running in remote server

    Unable to connect to the specified database.

 

 

 

 

 

 

 

Module still running:   AGENDYGN.AGENS024_01

  Date:        Day:      Time:          Resolution:

11/12/10     Fri          09:00           See note from Janice below.

 

Error log and follow up comments:

 

AGENDYGN_DAILY_GENERAL is still running from last night’s schedule.  Normal run time is 2 – 4 hours.

Module AGENDYGN.AGENS024_01 is the step that is currently running, and it normally takes less than a minute to complete.

Dermot.

 

I consulted with Mark Britton regarding this – there seems to be some stall  with the link being used in AGENS024.sql between BANPROD and HRPROD.  I killed the Appworx AGENDYGN.AGENS024_01 component – Mark will do cleanup on the databases to ensure that the associated processes are no longer running.  Once that cleanup has been completed, we’ll restart the KILLED AGENDYGN.AGENS024_01 component.

 

The problem is happening again so I killed the Appworx AGENDYGN.AGENS024_01 component. 

The DBAs are looking into the problem, which may take a while to resolve.

Janice.

 

If you killed this job in Appworx, I will kill the sessions in the hrprod database. 

They are still running in there.

Craig.

 

Yes – it has been killed in Appworx………Janice

 

Mark modified the AGENS024.sql script to work-around the problem.  I restarted the KILLED AGENDYGN.AGENS024_01 component and it successfully completed in 47 seconds!

IT Scheduling should flag /ais01/src/sql/temp/AGENS024.sql as okay through the end of November.

As follow-up, we’ll need a Clarity incident to document the AGENS024.sql change/add modlog entry to sql/etc. and migrate it into production.

Janice.

 

 

 

 

 

Aborted Module Name:   HRMSS041.HRMSS041_01

  Date:        Day:      Time:          Resolution:

11/12/10     Fri          20:40           See note from Bob below.

 

Error log and follow up comments:

 

 

Error: Mandatory Element missing

User-Defined Exception

ORA-06510: PL/SQL: unhandled user-defined exception

 

ORA-06512: at line 627

20:40:53 627                            l_segment := csuh_edi_834_pkg.edi_ins(

declare

*

ERROR at line 1:

ORA-20000: **** FATAL ERROR! ****

ORA-06512: at line 798***

 

In researching, I see that the error stems from the fact that “Jones, Kelley” does not have a contact_type_code.  This code is derived from taking the contact_type (in the per_contact_relationships table) and decoding it to some other value.  In this case the contact_type is “P” and it is not decoded to anything.

To resolve this issue, it needs to be determined what “P” (contact_type) should be decoded to for a contact_type_code and this added to the csr_dep cursor.

-Bob-

 

So who needs to make this determination?

Janice.

 

Chris D. was notified.  We’re waiting for her decision.

Thanks,  -Bob-

 

The data that caused this problem has been changed.

 

Please rerun the HRMSS041 script.  -Bob-

 

 

 

Aborted Module Name:   HRMSMLV_HRL.CHAIN_FINISH_01

 

  Date:        Day:      Time:          Resolution:

11/15/10     Mon        09:04           See note from Janice below.

 

Error log and follow up comments:

 

 

+ print *** \n*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE

+ FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog_exceptions

+ 1>> /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01_jobstat

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog

+ /appworx/out/HRMSMLV_HRL.CHAIN_FINISH_01.5364967.5364972.00.2010_11_15

+ _0904.AWPROD.LOG print *** \n*** END SEARCH OF JOBLOG FOR ERROR

+ STRINGS \n***

+ 1>> /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01_jobstat

+ cat /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01_jobstat

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cat: 0652-050 Cannot open /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01.PREFIX.spool.lis.

cat: 0652-050 Cannot open /ais01/dat/work/prod/HRMSMLV_HRL.CHAIN_FINISH_01.PREFIX.spool.lis.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

 

This was a timing issue with the HRMSMLV_HRL.CHAIN_FINISH and a spawned OAE_FEEDBACK sub_chain CHAIN_FINISH.  I've modified the global PREFIX script to uniquely qualify the PREFIX generated spool.lis file with the jobid.  This should prevent such timing issues in the future.  The component was restarted and completed successfully.

Janice.
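
Sketch (not part of Janice's note): the fix described above qualifies the PREFIX spool.lis file name with the jobid so that two CHAIN_FINISH modules no longer share one file. The exact name the PREFIX script now generates is not shown in this log, so the layout below (jobid placed before "spool.lis") and the function name are hypothetical.

```python
def prefix_spool_path(work_dir, module, job_id):
    """Build a spool file path that is unique per job id (hypothetical layout)."""
    return f"{work_dir}/{module}.PREFIX.{job_id}.spool.lis"


# Two jobs for the same module name no longer collide on one file.
first = prefix_spool_path("/ais01/dat/work/prod", "HRMSMLV_HRL.CHAIN_FINISH_01", 5364967)
second = prefix_spool_path("/ais01/dat/work/prod", "HRMSMLV_HRL.CHAIN_FINISH_01", 5364972)
```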

 

 

 

Aborted Module Name:   APWXMISC.APWXLOGP_01

  Date:        Day:      Time:          Resolution:

11/15/10     Mon        13:00           See note from Janice below.

11/22/10     Mon       10:41            See note from David below.

 

Error log and follow up comments:

 

11/15/10.

rm: Removing ./KFSXGLSC_D1.5330031.txt

rm: Removing ./KFSXGLSC_D2.5330303.txt

rm: Removing ./KFSXGLSF.5330163.txt

rm: Removing ./KFSXPDDR.KFSX_JAVA_01.5326775.pdf

rm: Removing ./KFSXPDGL.5330182.txt

rm: 0653-603 Cannot remove directory ./VPC.2318484.

rm: Removing ./VPC.2318484/AREGDYWL_SP.5325961.txt.ps

+ exit 2

 

The bulk of the time required to run the APWXMISC.APWXLOGP_01 component is for the joblog cleanup.  The process to determine which joblogs to delete is complex, since the analysis must retain 14 days and/or the latest 5 generations for each chain schedule of each chain.  Historically, it takes about 45 minutes to run this portion of the APWXLOGP job.  Whenever restarting a failed APWXMISC.APWXLOGP_01 component, the "Skip joblog cleanup" prompt for this component should be changed to Y **IF** the failure occurred after the completion of the joblog cleanup.

 

As an example, in this failure, the joblog indicates removal of joblog files had begun and completed:

***  11/15/2010-13:47:01

*** REMOVE /ais01/joblog FILES

....  feedback for various joblog removals

***  11/15/2010-13:47:05

*** END   COMMON CLEANUP FOR APPWORX AWPROD AGENT

The failure occurred subsequent to the joblog cleanup... in the section to remove /ais01/spool/vplus/out files:

***  11/15/2010-13:47:07

*** REMOVE /ais01/spool/vplus/out FILES

 

So, in this case, the restart should have been done with the "Skip joblog cleanup" prompt set to Y, which would have skipped this time-intensive joblog analysis/removal process.

By the way, the "Skip joblog cleanup" prompt change to "Y" would need to take place in **BACKLOG** prior to restarting the failed APWXMISC.APWXLOGP_01 component.

Janice.
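
Sketch (not part of Janice's note): the check described above can be expressed as a search of the joblog for the two markers quoted in the example, the start of the /ais01/joblog removal and the end of the common cleanup. The function name is hypothetical, and the marker strings are taken only from this one joblog excerpt; other runs may word them differently, so the result is a suggestion to confirm by reading the joblog.

```python
# Markers as they appear in the joblog excerpt above (whitespace normalized).
JOBLOG_START = "*** REMOVE /ais01/joblog FILES"
COMMON_END = "*** END COMMON CLEANUP FOR APPWORX AWPROD AGENT"


def skip_joblog_cleanup(joblog_text):
    """Suggest "Y" if the joblog cleanup started and then completed, else "N"."""
    # Collapse runs of spaces so "END   COMMON" matches "END COMMON".
    lines = [" ".join(line.split()) for line in joblog_text.splitlines()]
    try:
        start = lines.index(JOBLOG_START)
    except ValueError:
        return "N"
    return "Y" if COMMON_END in lines[start + 1:] else "N"
```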

 

11/22/10.    

We were trying to figure out this abort.  Janice sent an e-mail on 11/15/10 about a similar abort.  She mentioned that if the cleanup process had already started, then a prompt flag would have to be changed before restarting this module.  We could not find any place that indicated that the cleanup process had started.  Did you have to change the prompt flag before restarting this module?

We are just trying to learn.

 

Yes,

I modified the ‘Skip joblog cleanup’ from N to Y.  I did this because the failure occurred after the ‘COMMON’ purges had completed. I will have IT scheduling perform this next time this happens. It seems to happen quite often.  

David.

 

 

 

 

 

Aborted Module Name:   FAIDCFAT_FA.GLBDATA_01

  Date:        Day:      Time:          Resolution:

11/16/10     Tue         06:00           See follow up below.

 

Error log and follow up comments:

 

+ FAIDCFAT_SP.GLBDATA_01 + FAIDDYNT_EV.GLBDATA_03 + FAIDTRAK_EV.GLBDATA-LOOP_01

 

*ERROR* DURING PREPARE PARM2...ABORTING                                                                                            

  SQLCODE = 0942                                                                                                                   

SQL ERROR = ORA-00942: table or view does not exist

                                                                               

X01 ROLLBACK SQLCODE=0000                                                                                                          

X01 COMMIT (1) SQLCODE=0000                                                                                                        

SQLCODE = 0000                                                                                                                     

ORA-01403: no data found                                                                                                           

DQY-ABORT ROLLBACK SQLCODE = 0000

 

Tom Biedscheid is aware of the various GLBDATA failures - it has to do with the Banner Financial Aid upgrade over the weekend.  IT Scheduling doesn't need to send the joblogs/email for all of these GLBDATA failures - pretty sure if we find the problem with one, it will fix all...

 

Here’s the Control Report from FAIDTRAK – the other failures look similar:

SUNGARD HIGHER EDUCATION                                                     

                                                     POPULATION SELECTION EXTRACT                                                  

                                                          CONTROL REPORT                                               PAGE       1

                                                                                                                                    

              Start Time: 16-NOV-2010 00:24:17                                                                                     

         GLBDATA Version: 8.3.0.5                                                                                                   

          Selection ID 1: FAIDTRAK_EV_TRACK_GROUP                                                                                  

             Application: FINAID                                                                                                    

              Creator ID: FAUSER                                                                                                   

                                                                                                                                    

*ERROR* DURING PREPARE PARM2...ABORTING                                                                                            

  SQLCODE = 0942                                                                                                                    

SQL ERROR = ORA-00942: table or view does not exist

                                                                               

X01 ROLLBACK SQLCODE=0000                                                                                                           

X01 COMMIT (1) SQLCODE=0000                                                                                                        

SQLCODE = 0000                                                                                                                      

ORA-01403: no data found                                                                                                           

DQY-ABORT ROLLBACK SQLCODE = 0000       

Janice.

 

 

 

Aborted Module Name:   OSYSJOBS_04.OSYSLLNK_01

  Date:        Day:      Time:          Resolution:

11/21/10     Sun         16:30           Restarted by Janice.

 

 

Error log and follow up comments:

 

+ grep ^*** ERROR: /ais02/log/OSYSJOBS_04.OSYSLLNK_01.5398240.5398246.00.2010_11_21_1630.log

+ grep SCRIPT ABORTED

rsh_return_code=1

+ print *** \n*** RSH EXECUTED SCRIPT sys_llnk_rsh.ksh EXIT CODE=1  \n*** EXIT  WITH EXIT CODE=1  \n***

***

*** RSH EXECUTED SCRIPT sys_llnk_rsh.ksh EXIT CODE=1 

*** EXIT  WITH EXIT CODE=1 

 

I'm forwarding this email, which I sent early in November regarding a couple of OSYSLLNK failures.  Please review it for instructions on troubleshooting OSYSLLNK failures. As was indicated near the end of the email:

 

Sometimes we see these "status is not valid" messages for the find command - generally caused by a file being deleted after the find command "found" the filename. Usually, we can just restart jobs in this situation,....

 

Based on the previously communicated information, IT Scheduling could have just restarted the two failed OSYSLLNK components this morning because they both failed with similar 'The status on "filename here" is not valid' errors.

 

When the errtrap routine is invoked, it means that the last command executed returned a non-zero return code.  Therefore, it is important to include that last command, along with any associated error messages, when troubleshooting the problem.  In this case:

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls <#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_14.OSYSPURG_01.5287280.5287284.00_ban_too_old is not valid.

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_14.OSYSPURG_01.5287280.5287284.00_slave_files is not valid.

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_15.OSYSPURG_01.5287288.5287292.00_too_old is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 3 Remote Shell errtrap_rsh parm 2 value is 3

 

Sometimes we see these "status is not valid" messages for the find command - generally caused by a file being deleted after the find command "found" the filename. Usually, we can just restart jobs in this situation, which I did for the two OSYSJOBS this morning and they have successfully completed.

Janice.
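
For reference, the "safe to restart" decision described above can be sketched as a small shell check. This is only an illustration, not part of sys_llnk_rsh.ksh: the function name is hypothetical, and it assumes find's stderr has been captured to a file. It treats the failure as benign only when every message is the AIX "0652-019 ... is not valid" message shown in the log.

```shell
# Hypothetical helper (not part of sys_llnk_rsh.ksh).
# $1 = file holding the stderr captured from the find command.
# Returns 0 when every non-empty line is the benign AIX
# "0652-019 The status on ... is not valid." message,
# and 1 when the file is empty or contains any other message.
only_vanished_file_errors() {
    errfile=$1
    # No stderr at all means the non-zero exit had some other cause.
    if [ ! -s "$errfile" ]; then
        return 1
    fi
    # grep -v prints the lines that do NOT match the benign message;
    # if any such line exists, a real error is present.
    if grep -v 'find: 0652-019 The status on .* is not valid\.' "$errfile" | grep -q .; then
        return 1
    fi
    return 0
}
```

A return of 0 corresponds to the "just restart it" case; anything else still needs someone to read the joblog.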

 

I did a quick check on this; I think the error was a little farther back in the error log.

Check out the find command which produced the error.

I believe we get these once in a while when temp files are created and deleted while cleanup jobs are running.

Rich.

<</ais02/job/prod/kshexe_rsh.70>> sys_llnk_rsh.ksh

<#/ais02/job/prod/sys_llnk_rsh.ksh.23#> alias log=echo "*** " $(date +%m/%d/%Y-%T)

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /app/dars/darsprod/dars35/bin/temp/dumpww10110712281616 is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 1

 

 

 

Aborted Module Name:   HRMSKFS_QPH.HRMSS175_01

 

  Date:        Day:      Time:          Resolution:

11/29/10     Mon       08:25          Restarted by Janice.

 

Error log and follow up comments:

 

 

08:25:35 1627  utl_file.fclose(out_file1);

08:25:35 1628  utl_file.fclose(out_file2);

08:25:35 1629 

08:25:35 1630  If (ctl_sum_salary - l_net - ctl_sum_ee_deductions - ctl_sum_cash_deductions) <> 0 Then

08:25:35 1631            DBMS_OUTPUT.PUT_LINE('###################################################');

08:25:35 1632            DBMS_OUTPUT.PUT_LINE('#####                                         #####');

08:25:35 1633            DBMS_OUTPUT.PUT_LINE('####                                           ####');

08:25:35 1634            DBMS_OUTPUT.PUT_LINE('###        KFS FILE IS OUT OF BALANCE ###');

08:25:35 1635            DBMS_OUTPUT.PUT_LINE('####                                           ####');

08:25:35 1636            DBMS_OUTPUT.PUT_LINE('#####                                         #####');

08:25:35 1637            DBMS_OUTPUT.PUT_LINE('###################################################');

08:25:35 1638  --    RAISE kfs_not_balanced;

08:25:35 1639  End if;

08:25:35 1640 

08:25:35 1641  DBMS_OUTPUT.PUT_LINE('.');

08:25:35 1642  DBMS_OUTPUT.PUT_LINE

 

This appeared to be a timing issue.  HRMSS175 did **NOT** fail with an out of balance error.  The section of the joblog included in this email is simply a section of the sql logic, but this logic was not invoked by HRMSS175 during this execution.  In fact, HRMSS175 did not fail at all but rather the SUCCESS.HRMSS175 logic detected that the utl_file2 (.recon file) was empty and consequently set the hrmss175_status_{chain_id} subvar accordingly:

+ hrmss175_status=ERROR - HRMSS175 output file(s) empty/missing [[ ERROR

+ - HRMSS175 output file(s) empty/missing != SUCCESSFUL ]] print \n***

+ HRMSS175 NOT SUCCESSFUL - ABORT \n***

+ 1> /ais01/dat/work/prod/HRMSKFS_QPH.HRMSS175_01.Job_Error_Summary

+ print ERROR - HRMSS175 output file(s) empty/missing

+ 1>> /ais01/dat/work/prod/HRMSKFS_QPH.HRMSS175_01.Job_Error_Summary

+ [[ -s  ]]

+ echo END OF SUCCESS.HRMSS175

END OF SUCCESS.HRMSS175

The HRMSS175 chain component actually goes into ABORT status when the after condition checks the value in {#hrmss175_status_{chain_id}} and aborts the task if {#hrmss175_status_{chain_id}} is not equal to SUCCESSFUL.

So, there was a timing issue because the output utl_file2 was actually not empty -- but at the time that the SUCCESS.HRMSS175 script checked, it was empty.  Should this problem recur frequently, we may want to include a slight delay within the SUCCESS.HRMSS175 script (via a sleep command) to attempt to eliminate timing issues.

 

I restarted the failed HRMSS175 and it completed okay.

Janice.
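
If the delay mentioned above is ever added, a bounded retry may be preferable to a single fixed sleep. The following is a minimal sketch only; the function name, attempt count and delay are assumptions and nothing here is taken from the actual SUCCESS.HRMSS175 script.

```shell
# Hypothetical helper: wait until a file exists and is non-empty.
# $1 = file path, $2 = maximum number of checks, $3 = seconds between checks.
# Returns 0 as soon as the file has content, 1 if it is still empty or
# missing after the last check.
wait_for_nonempty_file() {
    file=$1
    max_attempts=$2
    delay=$3
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if [ -s "$file" ]; then
            return 0
        fi
        # Do not sleep after the final check.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

The status subvar would then be set to the error value only when the function returns 1, so a file that is a few seconds late no longer aborts the component.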

 

 

 

Aborted Module Name:   AROSDGLI.AROSS167

 

  Date:        Day:      Time:          Resolution:

11/30/10     Tue        21:00            See correct error from David below.

 

Error log and follow up comments:

 

 

*** /appworx/out/FAIDINST_NW.LYNX_01.status.txt ***

   URL=http://wsprod.colostate.edu/cwis231/onet/autorun/triple_crown_schols.aspx (GET)

STATUS=HTTP/1.1 200 OK

   URL=http://wsprod.colostate.edu/cwis231/onet/autorun/triple_crown_schols.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

 

 

Here is some more information regarding the FAIDINST.LYNX_01 failure:

 

    </head>

    <body bgcolor="white">

            <span><H1>Server Error in '/CWIS231/onet' Application.<hr width=100% size=1 color=silver></H1>

            <h2> <i>ORA-20100: ::Cannot create, record already exists::<br>ORA-06512: at &quot;BANINST1.CSUG_API_GLBEXTR&quot;, line 287<br>ORA-06512: at line 1<br></i> </h2></span>

            <font face="Arial, Helvetica, Geneva, SunSans-Regular, sans-serif ">

            <b> Description: </b>An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

            <br><br>

            <b> Exception Details: </b>System.Data.OracleClient.OracleException: ORA-20100: ::Cannot create, record already exists::<br>ORA-06512: at &quot;BANINST1.CSUG_API_GLBEXTR&quot;, line 287<br>ORA-06512: at line 1<br><br><br>

            <b>Source Error:</b> <br><br>

David.

 

 

 

 

 

Aborted Module Name:   INTLDALY.INTLS005_01

 

  Date:        Day:      Time:          Resolution:

09/14/11     Wed      19:08            Restarted by Joleen.

Error log and follow up comments:

 

ERROR at line 1:

ORA-20001: Bad data for PIDM: 10643397-1843 -ERROR- ORA-01843: not a valid month

ORA-06512: at line 292

 

The bad date was corrected so the INTLS005 module can be run again.

Peter.

 

 

 

 

Aborted Module Name:   AREGORFP_FA.SEND_MAIL_01

  Date:        Day:      Time:          Resolution:

12/08/10     Wed       13:06            See follow up from Vicki.

 

Error log and follow up comments:

 

 

#------------------------------------------------------------------------------

# # ADDRESS FILE [to:/ais01/dat/misc/mailst/SEND_MAIL.ADMSS425.LST]

#******************************************************************************

# FATAL : < main::validate_address

# FATAL : Error opening address file (/ais01/dat/misc/mailst/SEND_MAIL.ADMSS425.LST) : A file or directory in the path name does not exist.

#------------------------------------------------------------------------------

# [ 2010.12.08-13:06:54 ]

# RETURN CODE = 100

#==============================================================================

error is 100

===== Exiting PERL_CSU =====

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

 

The thing that jumps out at me is ADMSS425.  This chain does not process Admissions information.

# ADDRESS FILE [to:/ais01/dat/misc/mailst/SEND_MAIL.ADMSS425.LST]

 

Jerry Becker wrote the following:

The distribution list the chain should send the report (comma delimited file) to is ro_rpt_processing@mail.colostate.edu.

Vicki.

 

 

 

 

 

Aborted Module Name:   ADMSSRLD_FR.SQLSURLOAD-LOOP_01

  Date:        Day:      Time:          Resolution:

12/30/10     Thu         22:49           See follow up below.

 

 

Error log and follow up comments:

 

+ rm -ef /ais01/dat/work/prod/ADMSSRLD_FR.SQLSURLOAD-LOOP_01.surload_driver /ais01/dat/work/prod/d5575457

rm: removing /ais01/dat/work/prod/ADMSSRLD_FR.SQLSURLOAD-LOOP_01.surload_driver

rm: removing /ais01/dat/work/prod/d5575457

+ read this_sqlsurload_letter

+ let iteration_count=5+1

+ (( 6 < 10 ))

+ iteration_cnt=06

+ grep END SQL FOR LETTER_CODE=AGEN_AGEN_PTRFA /ais01/dat/work/prod/ADMSSRLD_FR.SQLSURLOAD-LOOP_01.DAT

+ print END SQL line not found on driver for AGEN_AGEN_PTRFA - abort this module

END SQL line not found on driver for AGEN_AGEN_PTRFA - abort this module

+ exit 1

+ err=1

 

I have corrected the error in the driver file so that we're good to go next time.  I'll talk to Joe about whether or not we want to run this Monday morning.

Kathy.

I modified the “work” version of the driver with the corresponding change

--* BEGIN SQL FOR LETTER_CODE=AGEN_AGEN_PTRFA       

TO

--* BEGIN SQL FOR LETTER_CODE=AGEN_PTRFA

so that the failed SQLSURLOAD-LOOP component could be restarted, picking up where it left off, thereby allowing for completion of the ADMS schedule today instead of waiting until Monday :-)

Janice.
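
A mismatch like this one could be caught before the loop runs by checking that every BEGIN marker in the driver has an END marker with the same LETTER_CODE value. The sketch below is an illustration only: the function name is hypothetical, and it assumes the markers always have the exact form shown above, one BEGIN/END pair per LETTER_CODE value.

```shell
# Hypothetical pre-flight check for a surload driver file.
# $1 = driver file. Prints every LETTER_CODE value that has a BEGIN marker
# without a matching END marker (or the reverse) and returns 1 if any are
# found, 0 otherwise.
check_driver_markers() {
    awk '
        # Return the text after the first "=" with trailing whitespace removed.
        function code(line) {
            line = substr(line, index(line, "=") + 1)
            sub(/[ \t\r]+$/, "", line)
            return line
        }
        /^--\* BEGIN SQL FOR LETTER_CODE=/ { opened[code($0)] = 1 }
        /^--\* END SQL FOR LETTER_CODE=/   { closed[code($0)] = 1 }
        END {
            bad = 0
            for (c in opened) if (!(c in closed)) { print "BEGIN without END: " c; bad = 1 }
            for (c in closed) if (!(c in opened)) { print "END without BEGIN: " c; bad = 1 }
            exit bad
        }
    ' "$1"
}
```

Run against a driver with the typo above, it would report AGEN_AGEN_PTRFA as a BEGIN without an END, and AGEN_PTRFA as an END without a BEGIN.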

 

Thanks, Janice.  I just corrected it in the file on Kebler...but I also made another change at the end of the file.  We had two identical pieces of SQL for two different letter codes so I removed the duplicate  code and added the letter code to the first one, separated with a comma:

--* BEGIN SQL FOR LETTER_CODE=AENR_LHPFR,ASPY_LHPFR

--* END SQL FOR LETTER_CODE=AENR_LHPFR,ASPY_LHPFR

I'm just letting you know, in case you notice the difference.  It won't matter if we use the work version that has the two separate sql's because I added the ASPY_LHPFR manually to Banner this morning.  Since the AENR_LHPFR code did run and generate letters, I wanted to make sure they stayed in sync.  Since the letters are already out there, the code that runs shouldn't pick up anyone new to add.   But I'll check it later just to make sure.

Sorry for the error.  I did a lot of cut-and-pasting yesterday and even though I checked it several times for typos like this, I still missed it. :-(  Sorry to make you work on a holiday!

Kathy.

When I did a “diff” between the files, the only change it detected was the one I noted – so, that’s the only one I changed in the “work” version of the driver which was being used by this step.   Guess you must have slipped this change in after I looked!

Your schedule is progressing now – only ADMSLETS and ADMSEMAL left running.

Have a Happy New Year!

Janice.

 

 

 

 

 

 

 

Aborted Module Name:  KFSXGLPO_D1.KFSX_JAVA_02

  Date:        Day:      Time:          Resolution:

09/29/11     Thu       20:02           See follow up below.

 

Error log and follow up comments:

 

 

at $Proxy210.post(Unknown Source)

                at org.kuali.kfs.gl.batch.service.impl.PosterServiceImpl.postTransaction(PosterServiceImpl.java:433)

                ... 46 more

Caused by:

java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."GL_ENTRY_T"."TRN_LDGR_ENTR_DESC" (actual: 41, maximum: 40)

                at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

 

Dermot called having problems re-starting KFSXGLPO_D1.KFSX_JAVA_02. The chain was aborting by design.

The chain expects the re-start file to be located at /ais02/app/kfs/prd/work/staging/gl/originEntry/gl_sortpost.restart.data. I copied the backup of the gl_sortpost.data to the appropriate file name. I used the ABORT Chain notes as a guideline to do this. Then John was able to fix the bad records and KFSXGLPO_D1.KFSX_JAVA_02 was re-started and completed successfully.

David.

 

Poster bombed for at least the 3rd time for the exact same problem. I was disappointed that we needed to involve David last night. I have updated the spreadsheet, but I’d like to revisit the problem, so that we document it thoroughly. I truly believe that we do not need to involve Janice or David P next time. Let’s discuss this next week.

John.

 

If this scenario happens again, the following step needs to be taken:

COPY

/ais01/bkp/{#1}.KFSX_JAVA_01.gl_sortpost.data.{#apmx_now_yyyymmdd_time}.bkp

To

/ais02/app/kfs/prd/work/staging/gl/originEntry/gl_sortpost.restart.data

Then restart the ABORTED module once the odd characters have been removed.
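
The copy step above can be sketched as a small shell function that refuses to stage an empty backup and confirms the copy afterwards. The function name is hypothetical, and both paths are passed in as arguments because the backup file name contains Appworx substitution values that have to be resolved by whoever runs it.

```shell
# Hypothetical helper: stage a gl_sortpost backup as the poster restart file.
# $1 = resolved path of the backup file, $2 = path of the restart file.
# Returns 0 only when the restart file is a byte-identical copy of the backup.
stage_restart_file() {
    backup=$1
    restart=$2
    if [ ! -s "$backup" ]; then
        echo "backup file missing or empty: $backup" >&2
        return 1
    fi
    cp -p "$backup" "$restart" || return 1
    # Confirm the copy before anyone restarts the module.
    cmp -s "$backup" "$restart"
}
```

The bad records still have to be fixed before the ABORTED module is restarted; the function only covers the file staging.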

 

 

 

Aborted Module Name:   AREGDYWL_SP.VPLUS_RCAP-LOOP_01

  Date:        Day:      Time:          Resolution:

01/09/11     Sun        23:02           Restarted by Steve.

01/06/13     Sun        00:02           Restarted by Joleen.

 

Error log and follow up comments:

 

01/09/11.    

+ rm -ef /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

rm: Removing /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

+ grep FTP_

+ print AREGDYWL_SP.VPLUS_RCAP-LOOP_01

+ rm -ef /ais01/dat/work/prod/AREGDYWL_SP.VPLUS_RCAP-LOOP_01_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

 

As this error indicates, the spawned module aborted.  Therefore, to determine the cause of the error, it is important to view the error message in the joblog of the aborted spawned module, AREGDYWL_FA.VPLUS_RCAPTURE_01:

 

+ /vplus/rcapture -hvplusprod.is.colostate.edu -dAREGDYWL_FA -m -ips

+ /ais01/spool/vplus/out/VPC.561472/AREGDYWL_FA.5250837.txt.ps

FATAL: Cannot connect to Host: vplusprod.is.colostate.edu  on port: 7980

+ exit 4 error is 4

This error was simply a problem connecting to vplusprod - I restarted the AREGDYWL_FA.VPLUS_RCAP-LOOP_01 component so it would re-spawn the VPLUS_RCAPTURE module.

 

Please proceed by checking the spawned output joblog (AREGDYWL_SP.VPLUS_RCAPTURE_01).  If the problem there is the "cannot connect to Host" error message, which I suspect it will be, then simply restart the VPLUS_RCAP-LOOP component, as described in this older email from back in October.

Janice.

 

01/06/2013.

+ egrep ABORTED|CRITFAIL|C-Error

      9643375.00 BATCH     AREGDYWL_SP.VPLUS_RC01/05 23:01

00:00:02 ABORTED

AREGDYWL_STOP_WAIT_LIST_NOTIFY

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

There are no conditions. I looked this up in Dermot's ABORT log. This Process Flow has aborted before with the same error; there was a note from Janice:

This error was simply a problem connecting to vplusprod - I restarted the AREGDYWL_FA.VPLUS_RCAP-LOOP_01 component so it would re-spawn the VPLUS_RCAPTURE module.

I restarted this job and it has finished running.

Joleen.

 

 

 

Aborted Module Name:   AROSMSTM.AROSS303_01

  Date:        Day:      Time:          Resolution:

01/15/11     Sat        10:00           See note from Janice & Josh below.

 

Error log and follow up comments:

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

ORA-03114: not connected to ORACLE

ORA-03114: not connected to ORACLE

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

I restarted AROSMSTM.AROSS303_01, but have placed the next component (AROSMSTM.AROSS305_01) on hold so results of AROSS303 can be verified before we allow the remainder of the chain to proceed. 

 

IT Scheduling: 

Please monitor AROSMSTM.AROSS303_01 and reply all to this email regarding AROSS303 completion/failure.

 

There is a problem with the statement generation program, coupled with the massive number of transactions to be processed.  We think statement generation, using the existing process, could take DAYS!  Josh is working on a re-write of statement generation - a project that has been in the works for a while, but due to the current performance issue has now become a high priority project (like we need it yesterday!).

Josh may be able to provide additional info regarding estimated timeframe.

Janice.

 

We are working to implement a new statement process.

We are hoping to have statements printing as soon as possible which will probably be tomorrow or Friday.

I will give you more information when I have it.

Josh.

 

 

 

 

 

Aborted Module Name:   AREGDYDL.FTPS_CURL_01

  Date:        Day:      Time:          Resolution:

01/18/11     Tue        18:05           Restarted by ITS.

 

Error log and follow up comments:

 

#==============================================================================

# FATAL : Command failed with code : 100

#------------------------------------------------------------------------------

# 2011.01.18-18:05:08  : RETURN CODE = 100 : /appworx/csu/exec/FTPS_CURL.PL

#==============================================================================

error is 100

===== Exiting SCRIPT_MS_CSU =====

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

 

Jerry is going to reset the password on the State Driver's License side. 

Can anyone tell me the name of the job that sends Jerry a monthly email reminder to reset the password?  He does not remember receiving one.

Once Jerry has reset the password, we can continue with this job.

Vicki.

 

Can you add ro_aries_security@mail.colostate.edu to the distribution list?  I think what happened is the email went to the CODoR to reset our password, but they don't know what our password is.  Matt, Denise or I may have to follow up this email with another email or phone call containing our password.  Or can Appworx be set up to send a second email following this one containing just the current password?

I think there should also be one that goes out quarterly to change our password - would you mind checking to make sure ro_aries_security@mail.colostate.edu is included in that one as well?  I think I've received this one before so it's probably ok...

Jerry.

 

I added ro_aries_security@mail.colostate.edu to the Email that runs on the 10th of each month. The quarterly Email has this address also, and I also remember receiving this one, so I think it is okay. I deleted AREGDYWL for today per Jerry Becker.

David.

 

 

 

Aborted Module Name:   KFSXCS52.KFSXS007_01

  Date:        Day:      Time:          Resolution:

01/20/11     Thu        00:26           See note from Janice & Kevin below.

 

Error log and follow up comments:

 

Reactivate Employee: Williamson,Mathew Thomas        49167   mtw              Student Hourly Employee          Employee

Reactivate Employee: Willson,Kendra Dawn      39767   kdsaine          Graduate Assistant        Employee

**ERROR update prncpl_id=42394 dwilson ORA-00001: unique constraint (KFSUSER.KRIM_PRNCPL_TC1) violated

Reactivate Employee: Wilson,Grace-Lyn Liberato       31600   wilikona         Administrative Professional   Employee

Reactivate Employee: Wolf-Ringwall,Amber Lee         10868   awolf51          Student Hourly Employee          Employee

+------------------------------------------------------------+

| EMPLOYEE INFORMATION JOB SUMMARY                     |

+------------------------------------------------------------+

+ [ -f login.2453894 ]

+ rm login.2453894

+ print *** \n*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE

+ FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_0

+ 1> 1_20_0026_sql_followup

+ egrep -v -f /ais01/dat/misc/prod/errstrg_sql_ORA_ok

+ 1>> /ais01/dat/work/prod/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_

+ 1>> 01_20_0026_sql_followup

+ egrep -f /ais01/dat/misc/prod/errstrg_sql

+ /appworx/out/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_01_20_0026.A

+ WPROD.LOG print *** \n*** END SEARCH OF LOG FOR SQL ERROR STRINGS

+ \n***

+ 1>> /ais01/dat/work/prod/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_

+ 1>> 01_20_0026_sql_followup

+ cat

+ /ais01/dat/work/prod/KFSXCS52.KFSXS007_01.5662687.5662689.00.2011_01_2

+ 0_0026_sql_followup

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

**ERROR update prncpl_id=42394 dwilson ORA-00001: unique constraint (KFSUSER.KRIM_PRNCPL_TC1) violated

***

 

I will have BFS update the person manually.   This chain can be completed.

Kevin.

 

To proceed with the chain, delete the failed KFSXCS52.KFSXS007_01 component.

Janice.

 

 

Aborted Module Name:   HRMSS241.HRMSS241_01

  Date:        Day:      Time:          Resolution:

01/21/11     Fri          20:32           See follow up below.

 

Error log and follow up comments:

 

 

20:32:35 792    dbms_output.put_line(' ');

20:32:35 793    dbms_output.put_line('ERROR - Carrier ID and/or Coverage code could not be identified for plan '||v_plan||' option '||v_option);

20:32:35 794

 

20:32:35 808    dbms_output.put_line('ERROR - An unexpected error has occurred.');

20:32:35 809    dbms_output.put_line(sqlerrm);

20:32:35 810    dbms_output.put_line('All changes have been rolled back. Fix problem and run the process again before transmitting files to the vendor

 

ERROR - Carrier ID and/or Coverage code could not be identified for plan Green

 

20:32:35 793    dbms_output.put_line('ERROR - Carrier ID and/or Coverage code could not be identified for plan '||v_plan||' option '||v_option);

20:32:35 794    dbms_output.put_line('All changes have been rolled back. Fix problem and run the process again before transmitting files to the vendor.');

20:32:35 795 

20:32:35 796    raise;

 

ERROR at line 1:

ORA-06510: PL/SQL: unhandled user-defined exception

ORA-06512: at line 796

 

This problem has been taken care of.  Do not restart it.  When the process runs next week it will pick up any missed transactions.

Stevie G.

 

So, the plan is to not send any files from this chain until next week then?

Janice.

 

Delete the module.  We will not be sending them files this week.

Stevie G.

 

If we are not to send them files this week, then we need to delete the chain.  If we only delete the module, the remainder of the components in the chain will run, which would send files to the vendor.

Janice.

 

 

 

 

Aborted Module Name:   AREGDYTR.SQLLOAD-LOOP_01

  Date:        Day:      Time:          Resolution:

01/24/11     Mon       07:19           See follow up below.

 

Error log and follow up comments:

 

value used for ROWS parameter changed from 64 to 9

Record 50: Rejected - Error on table CSUBAN.SWLTNSC, column REQ_SSN.

ORA-12899: value too large for column "CSUBAN"."SWLTNSC"."REQ_SSN" (actual: 21, maximum: 9)

Table CSUBAN.SWLTNSC:

  160 Rows successfully loaded.

  1 Row not loaded due to data errors.

  0 Rows not loaded because all WHEN clauses were failed.

  0 Rows not loaded because all fields were null.

Space allocated for bind array:                 239166 bytes(9 rows)

Read   buffer bytes: 1048576

Total logical records skipped:          0

Total logical records read:           161

Total logical records rejected:         1

Total logical records discarded:        0

Run began on Mon Jan 24 07:19:23 2011

Run ended on Mon Jan 24 07:19:24 2011

Elapsed time was:     00:00:01.56

CPU time was:         00:00:00.04

 

 

Please restart AREGDYTR.SQLLOAD-LOOP_01, which is currently in LOADFAIL status.  The data file has been corrected by removing special characters.  I don’t know if there’s an easy way for IT Scheduling to remember this – but it would be helpful to copy Josh on AREGDYTR chain failure messages since he is one of the “experts” for Transcripts processing and often resolves the problem.  He is not in the AGEN AREG alert list though (and probably shouldn’t be), so not sure how that should be handled.

Janice.
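
A data file could be screened for this kind of problem before the load step. The sketch below is an illustration only: the function name is hypothetical, and it assumes that "special characters" means bytes outside printable ASCII (tabs are allowed), which the email does not actually state.

```shell
# Hypothetical pre-load check. $1 = data file.
# Prints each offending line (prefixed with its line number) and returns 1
# if any byte outside printable ASCII, space or tab is found.
# Returns 0 when the file is clean and 2 when it cannot be read.
has_only_plain_text() {
    if [ ! -r "$1" ]; then
        return 2
    fi
    # LC_ALL=C makes grep compare byte by byte, so control characters and
    # non-ASCII bytes fall outside the [:print:] class.
    if LC_ALL=C grep -n '[^[:print:][:blank:]]' "$1"; then
        return 1
    fi
    return 0
}
```

A non-zero return would give IT Scheduling a line number to pass along to whoever corrects the file.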

 

 

 

 

Aborted Module Name:  EIDSUPDT.HRMSS111_01

 

  Date:        Day:      Time:          Resolution:

01/31/11     Mon       22:45           Restarted by ITS.

02/01/11     Tue        08:23           Restarted by ITS.

02/01/11     Tue        23:01           Restarted by ITS.

 

Error log and follow up comments:

 

01/31/11.

   

829492000 ORA-01422: exact fetch returns more than requested number of rows

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

The following person has duplicate eID email addresses.  I just sent an email to Randy Miotke asking if one of the "duplicate" records could be removed.

 

focklera      1218514

823677933     allison.fockler@rams.colostate.edu;

Vicki.

 

It appears that other than for the problem person, HRMSS111 was successful, updating 32 records (see HRMSS111 JOB SUMMARY output below).  It was in our search of the sql output that the "ORA-" message was detected, thereby forcing the component to fail. Would it be okay to just delete the failed HRMSS111 for today or do you want to rerun it after Randy Miotke deals with the duplicate?

Janice.

 

The data has been fixed.  Please restart HRMSS111.

Vicki.

 

02/01/11 @ 08:23. & @ 23:01 (with same error as above).

 

Data has been fixed. Please start EIDSUPDT.HRMSS111_01 again.

Rami.

 

CSU ID 829312674 Hayley Templeton-Norris

appears to have 2 primary eIDs - hayely, hayleytn - and therefore this person has 2 primary email addresses that we are trying to assign to one employee.

Randy, can you please take a look at this and let us know what should be done about it?

Vicki.

 

I called Sue Coulson and found that this person’s record state was the result of a merge in Admissions. I’ve demoted the eID “hayely” to a secondary. So the job should now be able to run.

Joe, I need to change the First.Last alias to reflect Hayley’s correct last name. It’s currently Hayley.Templeton-norris@rams.colostate.edu and should reflect the modified (correct) last name: Hayley.Templeton@rams.colostate.edu.  You may want to have someone notify Hayley.

Also, if a record does need to be merged, please have the processor check to be sure that two eIDs don’t exist for the person. If they do, I need to do some work on my side so this condition is avoided in the eID data.  Let me know if there are questions.

Randy.

 

We can restart HRMSS111 in EIDSUPDT.

Vicki.

 

 

 

 

Aborted Module Name:   KFSXCS52.KFSXS007_01

  Date:        Day:      Time:          Resolution:

02/01/11     Tue        00:12           See note from Kevin below.

08/11/11     Thu        03:06           Deleted by Dermot.

 

Error log and follow up comments:

 

02/01/11.

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

**ERROR update prncpl_id=4677 swaps ORA-00001: unique constraint (KFSUSER.KRIM_PRNCPL_TC1) violated

*** Error: 29540 ORA-00001: unique constraint (KFSUSER.KRIM_PRNCPL_TC1) violated

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

I will relay the errors (2) on to BFS.  This chain can be completed/deleted.

Kevin.

 

08/11/11.

ORA-20000: ORU-10027: buffer overflow, limit of 1000000 bytes

ORA-06512: at "SYS.DBMS_OUTPUT", line 32

ORA-06512: at "SYS.DBMS_OUTPUT", line 97

ORA-06512: at "SYS.DBMS_OUTPUT", line 112

 

The buffer overflowed; it appears that a large number of people (12,000+) were set to be inactivated.

I want to make sure that is correct before we restart, because that seems very excessive.

I will be looking at this and will let you know when we have more details.

Josh.

 

 

 

Aborted Module Name:   FAIDTSWF_TUITION_SCHOLR_WKFLO

  Date:        Day:      Time:          Resolution:

02/03/11     Thu        09:46           See follow up below.

 

 

Error log and follow up comments:

 

“Empty value for prompt not allowed”.

 

I tried to request the FAIDTSWF chain in for Candy and I got this message (see attached).

Please advise.

David fixed the prompt on this chain and I ran it. Thanks, David :-)

I called Candy Chapman to verify that it is OK, but she was out. I left a message.

Joleen.

 

The prompt value was empty. David inserted the value below into the prompt value field and the chain finished successfully.

Dermot.

 

{#FAID_AID_YEAR}

 

 

 

 

Aborted Module Name:   ADMSSCOR.VPLUS_RCAP-LOOP_01

  Date:        Day:      Time:          Resolution:

02/07/11     Mon       06:04           Restarted by ITS.

03/05/13     Tue        06:02           See note from Elden below.

 

Error log and follow up comments:

 

02/07/11.

+ 1> /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog_exceptions

+ 1>> /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog

+ /appworx/out/ADMSSCOR.VPLUS_RCAP-LOOP_01.5747767.5747774.00.2011_02_07

+ _0604.AWPROD.LOG rm -ef

+ /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

rm: Removing /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

+ grep FTP_

+ print ADMSSCOR.VPLUS_RCAP-LOOP_01

+ rm -ef /ais01/dat/work/prod/ADMSSCOR.VPLUS_RCAP-LOOP_01_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

 

Failed because VistaPlus Prod is not up.

Greg/Rich have indicated that we need a DBA to fix the above issue with the databases.

Rich just informed me that VistaPlus Prod is back up. IT Scheduling may restart all the failed VPLUS_RCAP-LOOP components.

Janice.

 

03/05/13.

+ awexe upd_var_value subvar=#vplus_rcap_iterations_10000457 var_value= flag=Y

ERROR -999 ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 1

ORA-06512: at "APPWORX.AWDYN", line 23

ORA-06512: at line 1

 

It looks like the /appworx/csu/exec/VPLUS_RCAP-LOOP.KSH script is inconsistently using iteration_count and iteration_cnt – it may be that the upgrade now does not allow an empty var_value?

Elden.
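
If the empty var_value does turn out to matter, a guard in front of the update would make the failure explicit instead of leaving it to surface as an ORA- error. This is a sketch only: the function name is hypothetical, and it just builds the argument string seen in the trace above without calling awexe.

```shell
# Hypothetical guard: refuse to build an update with an empty value.
# $1 = subvar name, $2 = value to store.
# Prints the argument string in the form seen in the joblog trace, or
# returns 1 with a message on stderr when the value is empty.
build_var_update_args() {
    if [ -z "$2" ]; then
        echo "refusing to update $1 with an empty value" >&2
        return 1
    fi
    printf 'subvar=%s var_value=%s flag=Y\n' "$1" "$2"
}
```

With a check like this, a mix-up between iteration_count and iteration_cnt would show up as an obvious "empty value" message in the joblog.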

 

 

 

 

Aborted Module Name:   KFSXAPEI.VPLUS_RCAP-LOOP_01

 

  Date:        Day:      Time:          Resolution:

03/05/13     Tue        05:32           See follow up below.

 

Error log and follow up comments:

 

+ cat /ais01/spool/vplus/out/KFSXAPEI.10000382.txt

+ 1> /ais01/spool/vplus/out/KFSXAPEI.txt.BKP

+ [[ N = Y ]]

+ awexe upd_var_value subvar=#vplus_rcap_iterations_10000382 var_value= flag=Y

ERROR -999 ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 1

ORA-06512: at "APPWORX.AWDYN", line 23

ORA-06512: at line 1

 

It looks like the /appworx/csu/exec/VPLUS_RCAP-LOOP.KSH script is inconsistently using iteration_count and iteration_cnt – it may be that the upgrade now does not allow an empty var_value?

 

The actual error may be due to "#vplus_rcap_iterations_10000382" being longer than allowed: it's 31 characters long including the leading "#".  Our chain IDs have just gone over the 10000000 mark.  Maybe we can shorten the vplus_rcap_iterations_... prefix to vplus_rcap_iter_?

 

I have a modified version of the script I will test shortly.

Elden.
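
The length arithmetic behind the suspected cause can be checked with a short sketch.  The 30-character limit is an assumption (the note above only says the 31-character name is longer than allowed), the seven-digit chain ID is a made-up comparison value, and subvar_name is a hypothetical helper, not an AppWorx call.

```python
# Assumed limit: the log only says that 31 characters is too long.
ASSUMED_MAX_LEN = 30


def subvar_name(prefix: str, chain_id: int) -> str:
    """Build a substitution-variable name as "#" + prefix + chain ID (hypothetical helper)."""
    return f"#{prefix}{chain_id}"


# 9999999 is a made-up seven-digit ID for comparison; 10000382 is from the log.
for chain_id in (9999999, 10000382):
    for prefix in ("vplus_rcap_iterations_", "vplus_rcap_iter_"):
        name = subvar_name(prefix, chain_id)
        verdict = "ok" if len(name) <= ASSUMED_MAX_LEN else "too long"
        print(f"{name}: {len(name)} characters ({verdict})")
```

Under that assumed limit, only the long prefix combined with an eight-digit chain ID goes over, which would be consistent with the failures starting once chain IDs passed 10000000.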

 

 

 

Aborted Module Name:   HRMSWKYD.HRMSS166_01 

  Date:        Day:      Time:          Resolution:

02/07/11     Mon       10:03           Restarted by ITS.

 

Error log and follow up comments:

 

 

10:03:53  44          ,'Yes'                                    login_found_banr

10:03:53  45                ,account_status                          banr_account_status

10:03:53  46         from jobprd.csu_all_users@banprod

10:03:53  47               where account_status in ('OPEN','EXPIRED')

10:03:53  48         ) banr,

 

It appears that there was a problem with a database link.  Go ahead and restart the job.

Steve H.

 

I reset HRMSWKYD.HRMSS166_01 and it Aborted with the following error:

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 46:

ORA-04052: error occurred when looking up remote object

ORA-00604: error occurred at recursive SQL level 1

ORA-12519: TNS:no appropriate service handler found

Robin.

 

HRMSWKYD.HRMSS166_01 is failing with the "ORA-12519: TNS:no appropriate service handler found" error message which Mark mentioned in last night's news file. HRMSS166 runs with login of jobprd@hrprod.

Janice.

 

Should be fixed now.

Mark B.

 

 

 

Aborted Module Name:   AGENWYMD.AGENS021_01

  Date:        Day:      Time:          Resolution:

02/08/11     Tue        00:25           Restarted by ITS.

 

Error log and follow up comments:

 

 

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 256

 

00:05:35 255      begin <<get_address>>

00:05:35 256          select substr(spraddr_street_line1,1,10), substr(spraddr_zip,1,5), spraddr_city

00:05:35 257          into v_address_line, v_zip, v_city

 

We have a problem in AGENS021 – The Weekly GP Edits.  SPRADDR_CITY is bigger than the v_city variable (20 characters).  We need to modify the program appropriately and will let you know when to try again.

Vicki.
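
One possible fix, not necessarily the change that was made to AGENS021.sql, is to guard spraddr_city with the same SUBSTR treatment already applied to the street line and zip.  A minimal sketch of the idea, assuming v_city holds 20 characters as the note above suggests; the city value is made up for illustration.

```python
# Assumed width of v_city, taken from the note above (20 characters).
V_CITY_WIDTH = 20


def substr(value: str, start: int, length: int) -> str:
    """Mimic Oracle SUBSTR for a 1-based start and a positive length."""
    return value[start - 1:start - 1 + length]


# Made-up city value, longer than 20 characters, for illustration only.
city = "A City Name Longer Than Twenty"

overflows = len(city) > V_CITY_WIDTH      # unguarded value does not fit
guarded = substr(city, 1, V_CITY_WIDTH)   # guarded value is cut to fit
print(overflows, len(guarded))
```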

 

AGENS021.sql has been updated and copied to temp KEBLER.

Rami.

 

Please restart AGENWYMD.AGENS021_01.

Vicki.

 

 

 

 

Aborted Module Name:   AROSBURS_FT.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

02/09/11     Wed        05:14          Restarted by ITS.

 

Error log and follow up comments:

 

# > ssh: connect to host bfsapp1.acns.colostate.edu port 22: A remote host did not respond within the timeout period.

***

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

 

Please contact the Bursar's Office (BFS_Bursar@mail.colostate.edu) to determine when their server, bfsapp1.acns.colostate.edu, will be available.

Janice.

 

It appears that bfsapp1.acns.colostate.edu port 22 is unavailable, can you please let us know when this is back up and we can restart AROSBURS_FT_TRANSFER_TO_BURSAR which failed earlier.

Dermot.

 

The bfsapp1 server crashed last night.  We are in the process of repairing it, but it may take a while.

 

The bfsapp1 server is back up and running.  You should be able to run the AROSBURS_FT_TRANSFER_TO_BURSAR process now.

Mike G.

 

Does IT Scheduling want to add some Chain Abort notes to the AROSBURS_FT_TRANSFER_TO_BURSAR chain documenting the "contact Bursar's office to determine when their server, bfsapp1.acns.colostate.edu, will be available" action to be taken if the SSH_SFTP failure is related to an unresponsive remote host?

Example of error message-

ssh: connect to host bfsapp1.acns.colostate.edu port 22: A remote host did not respond within the timeout period.

Similar Chain Abort notes could also be added to:

HRMSBURS_FT_TRANSFER_TO_BURSAR

KFSXBURS_FT_TRANSFER_TO_BURSAR

Of course, if the SSH_SFTP failure is unrelated to remote host connection problems, then the Bursar's office would not be the first point of contact for troubleshooting.

Thanks,

Janice.
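
The routing rule proposed for the Chain Abort notes could be sketched as follows.  This is only an illustration of the decision described above; first_contact is a hypothetical helper and the return strings are placeholders, not wording from the actual abort notes.

```python
# Message text copied from the example error above.
REMOTE_HOST_TIMEOUT = "A remote host did not respond within the timeout period"


def first_contact(joblog_text: str) -> str:
    """Suggest a first point of contact for an SSH_SFTP abort (hypothetical rule)."""
    if "ssh: connect to host" in joblog_text and REMOTE_HOST_TIMEOUT in joblog_text:
        return "Bursar's Office"
    return "normal troubleshooting"


sample = ("ssh: connect to host bfsapp1.acns.colostate.edu port 22: "
          "A remote host did not respond within the timeout period.")
print(first_contact(sample))
```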

 

AROS Abort notes updated by Dermot, KFSX by Robin & HRMS by James.

 

 

 

 

 

Aborted Module Name:   HRMSWKSP_01.HRMSS018_01

  Date:        Day:      Time:          Resolution:

02/14/11     Mon       22:17          See follow up below.

 

Error log and follow up comments:

ORA-20006: Error getting session dates ORA-20003: Error getting WORK_STUDY_AY_END_DATE ORA-01403: no data found

ORA-06512: at line 537

22:17:46 535    dbms_output.put_line(sqlerrm);

22:17:46 536    rollback;

22:17:46 537    raise;

22:17:46 538  END;

22:17:46 539

Grantham,Justin Daniel              10767901    26-Feb-2010 201010 MWSA 30.51

ORA-20006: Error getting session dates ORA-20003: Error getting WORK_STUDY_AY_END_DATE ORA-01403: no data found declare

 

The error occurs on "Kinsell,Heidi M" when it's going to find her fall end date:

      select to_date(substr(global_value, 1, 10),'YYYY/MM/DD')

      into   p_fall_end_date

      from   ff_globals_f

      where  global_name = 'WORK_STUDY_FLL_END_DATE'

      and    p_current_date between effective_start_date and effective_end_date;

The problem is that the p_current_date that is being passed in is '01-APR-2005'.

The only records that exist in ff_globals_f where global_name = 'WORK_STUDY_FLL_END_DATE' have the dates of:

       22-DEC-06

       21-DEC-07

       19-DEC-08

       18-DEC-09

       17-DEC-10

The question is how to fix this problem.  -Bob-
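
The failure mode can be sketched as an effective-date lookup that finds no covering row.  The fall end dates are the ones listed above; the effective ranges are an assumption (each row is taken to cover the calendar year of its end date), since the actual effective_start_date and effective_end_date values are not shown in the log.

```python
from datetime import date

# Fall end dates listed above. The effective ranges are an assumption:
# each row is taken to cover the calendar year of its end date.
fall_end_dates = [date(2006, 12, 22), date(2007, 12, 21), date(2008, 12, 19),
                  date(2009, 12, 18), date(2010, 12, 17)]
rows = [(date(d.year, 1, 1), date(d.year, 12, 31), d) for d in fall_end_dates]


def lookup_fall_end_date(current_date):
    """Return the fall end date whose effective range covers current_date."""
    for start, end, value in rows:
        if start <= current_date <= end:
            return value
    return None  # the SELECT ... INTO would raise ORA-01403 (no data found)


print(lookup_fall_end_date(date(2005, 4, 1)))
print(lookup_fall_end_date(date(2010, 2, 26)))
```

Under that assumption, a p_current_date of 01-APR-2005 falls before every row, which matches the "no data found" error.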

 

I just talked with the HR team and the decision is that additional follow-up will need to take place tomorrow for this failure.  Therefore, to allow last night's HRMSAW99 to complete, I added the HRMSWKSP chain prefix to the /ais01/dat/work/prod/HRMSAW99.WAIT_FOR_CHAINS_01.DAT "exclusions" file.  This will allow HRMSAW99 to complete the next time it wakes up (in just a few minutes) -- thereby creating the notify file which HRMSAW01 requires in order to allow tonight's HRMS nightly schedule to proceed.

HRMSWKSP_01.HRMSS018_01 will remain in ABORTED status until tomorrow's follow-up has occurred…..Janice.

Oh.. I see that HRMSAW00 staging was not delayed so now the HRMSAW99 from yesterday is detecting the HRMS chains which were staged in.  IT Scheduling will need to add 02/14 to the #HRMSAW99_EXCLUDE_DATE subvar ASAP to allow HRMSAW99 to complete.

Janice.

 

I have researched this problem and it doesn't seem to make any sense.  The error message is ORA-12899: value too large for column "PSP"."PSP_ENC_LINES"."CREATED_BY" but I reviewed the script and there are no updates to the created_by column on the psp_enc_lines table.  I also tested the script and it ran successfully without any errors.  I reviewed the script notes and it doesn't appear that data has changed since the procedure is simply to run the entire chain again.

We may want to run this chain sometime tomorrow during the day and see if it completes successfully.  If it does then tomorrow night's run should be successful.  If it fails again then we can take a look at it.

Steve H.

 

Not sure if we're close on a solution or not - but since it is so late already, IT Scheduling should go ahead and add HRMSWKSP chain prefix to the /ais01/dat/work/prod/HRMSAW99.WAIT_FOR_CHAINS_01.DAT "exclusions" file again like we did yesterday to allow HRMSAW99 to complete. Steve Hill just informed me that he has created a temp HRMSS018.sql to solve the HRMSWKSP problem. Please restart the aborted HRMSWKSP_01.HRMSS018_01 component ASAP.

Janice.

 

 

 

 

Aborted Module Name:   AREGDYAD.WEB_API_01

 

  Date:        Day:      Time:          Resolution:

02/15/11     Tue        02:54          Restarted by Janice.

08/14/13     Wed       07:16          Restarted by Joleen.

 

Error log and follow up comments:

 

02/15/11.

# 20110215-025502 : *** FATAL ***   | Base_API::_exit < ScriptInterface::_exit : stack

# 20110215-025502 : *** FATAL ***   | [00] [/appworx/csu/exec/Base_API.pm:000788]   Base_API::call_stack           #=1 @=1 < Base_API

# 20110215-025502 : *** FATAL ***   | [01] [/appworx/csu/exec/WEB_API.PL:000198]    Base_API::_exit                #=1 @=0 < ScriptInterface

# 20110215-025502 : *** FATAL ***   | [02] [/appworx/csu/exec/WEB_API.PL:001281]    ScriptInterface::_exit         #=1 @=0 < ScriptInterface

# 20110215-025502 : *** FATAL ***   | [03] [/appworx/csu/exec/WEB_API.PL:001590]    ScriptInterface::post_file     #=1 @=0 < Main

# 20110215-025502 : *** FATAL ***   | [04] [/appworx/csu/exec/WEB_API.PL:001461]    Main::main                     #=1 @=0 < Main

# 20110215-025502 : *** FATAL ***   | ****************************************************************************************************

# 20110215-025502 : *** FATAL ***   | Post Error

# 20110215-025502 : *** FATAL ***   | ****************************************************************************************************

 

Error was:

# 20110216-034811 : [UAResponse]    | 500 Can't connect to apps.gradesfirst.com:443 (Bad hostname 'apps.gradesfirst.com')

Changed "apps.gradesfirst.com" to "app.gradesfirst.com" and restarted - completed successfully.

There was a problem with the URL prompt value - I corrected this value and restarted - completed successfully.

A like change has been made to the chain definition.

Janice.

 

08/14/13.

SSL negotiation failed:  at /usr/opt/perl5/lib/site_perl/5.8.8/LWP/Protocol/http.pm line 31

 

I tried restarting and got the same error.

 

AREGDYAD is the Daily Athletic Extract.  This is sending data to Grades First, I believe.

I would suggest trying the WEB_API again and if it does not work, then let's talk with Elden.

 

I called GradesFirst this morning at (800) 745-5180.  They said that they had changed their SSL certificate, and that this is what is preventing us from sending our updated data to them.

They are going to send an email with instructions.  We will follow up after we receive that information.

Vicki.

 

 

 

Aborted Module Name:   HRMSENCD.HRMSS074_01

  Date:        Day:      Time:          Resolution:

02/15/11     Tue        20:47          See note from David below.

 

Error log and follow up comments:

 

 

HRMSENCD.HRMSS074_01 failed with the below error:

20:47:40  82        update psp_enc_lines pel

20:47:40  83          set encumbrance_amount = 0

20:47:40  84          where ENC_LINE_ID = X.ENC_LINE_ID;

20:47:40  85      v_count1 := v_count1 + 1;

20:47:40  86   Else

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE

FOLLOWING:

***

ERROR at line 1:

ORA-12899: value too large for

column "PSP"."PSP_ENC_LINES"."CREATED_BY"

ORA-06512: at line 82

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

 

I contacted Steve Hill, who instructed me to skip the HRMSENCD chain for tonight. I followed the below instructions from the Chain Notes to skip HRMSENCD:

*****NOTE: If it is determined that the problem will not be fixed - i.e. HRMSENCD_DAILY_ENCUMBRANCES chain will be skipped and follow-up will be done the next day,  then proceed as follows:

1)  Delete the HRMSENCD_DAILY_ENCUMBRANCES chain from backlog.

2)  Delete the HRMSAW14_ENCUMBRANCES_DONE chain from backlog.

This will allow downstream dependent chains (such as the rest of the HR update schedule) to proceed, but skip the HRMSAW14 chain components which would have created notify files for KFSX encumbrance related processing.  Consequently, KFSX related encumbrance processes will be skipped.

David.

 

 

 

 

Aborted Module Name:   KFSXSYPG.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

02/16/11     Wed        19:01          See follow up below.

 

Error log and follow up comments:

 

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by

+ MESSAGES TO STD OUTPUT \n***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

 

Several KFSX jobs have failed with "ERROR: Malta SCRIPT ABORTED - EXIT CODE=1". I contacted Kevin who will login to check the scripts. He also suggested calling Shawn to check out Malta. Shawn is also looking.

Shawn modified the script that runs the JAVA. I have restarted all the failed KFSX JAVA jobs.

David.

 

Last night's problem with KFSX Java not only caused a stall/delay in the nightly KFSX production schedule, but many of the nightly processes actually did not run due to the automated cancel and proceed feature (CHAIN_CANCEL) within these chains (see below).  While this automated feature is designed for the occasional specific failure within these chains, the unfortunate side-effect is that when a global process (such as the kfsa_java script) is malfunctioning then **many** nightly batch processes are in essence never performed.  Consequently, due to the potential adverse impact, it is extremely important that any changes to global scripts are adequately tested.

We were successfully running KFS java programs via Applications Manager during the day yesterday, with the last successful execution of a KFS java program being the Clear Cache program at 16:02.  The next java program, which ran around 19:00 failed immediately after executing the "unset CATALINA_OPTS" statement (within the /app/env/kfsprd_common.env script?).  Between 19:00 and 19:11, nine java programs in nine different KFSX chains failed with the same error.  Of those, the five chains listed below (KFSXAPAL, KFSXAPAP, KFSXAPRP, KFSXPDCA, KFSXPDGL) contained the automated cancel - so those processes did **not** run last night - i.e. they were not restarted by David because they were not in backlog.  Only the failed java programs within KFSXPDFR, KFSXSYPG, KFSXCGCF, KFSXFPPC were restarted. Do we know what caused the problem?   Janice.

 

I added the “unset CATALINA_OPTS” command to the kfstrng_common.env script yesterday as part of the work I am doing to diagnose KFS/Tomcat stoppage issues. I need that to occur between a startup and a shutdown. Before making the change, I checked the kfsprd_appworx.env script. The “appworx” script executes the “common” script and then sets CATALINA_OPTS. So, the appworx script essentially unsets CATALINA_OPTS before resetting another value.

What I suspect was happening is that the KFS jobs in AppWorx are set up to not only run the appworx script, but then follow by running the common script. Would you verify whether this is the case or not? If it is, it should not be necessary for the jobs to explicitly execute the common script as the appworx script already handles that. If it isn’t, we will need to look further.  Shawn.

 

The Applications Manager KFSX_JAVA script has not changed since 4/20/2010 – it performs an SSH to the host machine (in this case, Malta2) and specifies the host command (in this case, the kfsx_java_ssh.ksh script) to run on the host machine.  Additional parameters are passed into the kfsx_java_ssh script, such as the java program name to be executed, the “prd” instance qualifier, etc.  Again, the kfsx_java_ssh.ksh script has not changed since 04/20/2010.  Within the kfsx_java_ssh.ksh script, the following command is executed:

. /app/env/kfs${batch_service_env}_appworx.env  (in this case, ${batch_service_env} would evaluate to prd)

I don’t have a login to Malta (or if I do, I don’t remember the password) – but based on the echoing from our joblogs I’m guessing that it is within the kfsprd_appworx.env script (which I believe is maintained by you/DBA’s) that the “. /app/env/kfsprd_common.env” statement would be located.

<</ais02/job/prod/kshexe_ssh.74>> kfsx_java_ssh.ksh prd RunBatch pcardNotificati

<#/ais02/job/temp/kfsx_java_ssh.ksh.24#> hostname Malta                                                                           

<#/ais02/job/temp/kfsx_java_ssh.ksh.31#> batch_service_env=prd                 

<#/ais02/job/temp/kfsx_java_ssh.ksh.32#> batch_service=RunBatch                

<#/ais02/job/temp/kfsx_java_ssh.ksh.33#> [[ RunBatch = RunBatch ]]             

<#/ais02/job/temp/kfsx_java_ssh.ksh.35#> (( 4 < 4 ))                           

<#/ais02/job/temp/kfsx_java_ssh.ksh.40#> export RunBatch_stepName=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00                       

<#/ais02/job/temp/kfsx_java_ssh.ksh.41#> export RunBatch_jobName=KFSXFPPC.pcardNotificationStep.5795021.5795032.00

<#/ais02/job/temp/kfsx_java_ssh.ksh.42#> batch_service_parms=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00                          

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_appworx.env      -> This is in kfsx_java_ssh.ksh

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_common.env       -> This is NOT in kfsx_java_ssh.ksh, so must be within kfsprd_appworx.env

The statement immediately following the “. /app/env/kfsprd_common.env” statement in the kfsx_java_ssh.ksh script is:

BATCH_DIR=${CATALINA_HOME}/webapps/kfs-${batch_service_env}/WEB-INF/classes

Which we actually never got to due to the failure on the “unset CATALINA_OPTS” command – which has been invoked somewhere directly (or indirectly) via the “. /app/env/kfsprd_common.env” statement.

I’m still confused – if you only changed kfstrng_common.env script, then how did that change cause the kfsprd scripts to fail?...Janice.
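
To make the path expansion behind this question concrete, here is a minimal sketch of how the quoted statement resolves for a given instance qualifier.  The pattern is the one quoted from kfsx_java_ssh.ksh above; the "trng" value is an assumption used only for contrast, and this log does not confirm that a kfstrng_appworx.env file exists.

```python
from string import Template

# Path pattern quoted from kfsx_java_ssh.ksh in the note above.
pattern = Template("/app/env/kfs${batch_service_env}_appworx.env")

# "prd" is the qualifier shown in the joblog; "trng" is assumed for contrast.
for env in ("prd", "trng"):
    print(f"{env}: {pattern.substitute(batch_service_env=env)}")
```

With batch_service_env=prd, only kfsprd_* scripts are named by this statement, so a change confined to kfstrng_common.env would not be expected to reach the production jobs through it.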

 

 

 

 

Aborted Module Name:   KFSXPDFR.KFSX_JAVA_02

 

  Date:        Day:      Time:          Resolution:

02/16/11     Wed        19:06          See follow up below.

 

Error log and follow up comments:

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by

+ MESSAGES TO STD OUTPUT \n***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

 

Several KFSX jobs have failed with "ERROR: Malta SCRIPT ABORTED - EXIT CODE=1". I contacted Kevin who will login to check the scripts. He also suggested calling Shawn to check out Malta. Shawn is also looking.

Shawn modified the script that runs the JAVA. I have restarted all the failed KFSX JAVA jobs.

David.

 

Last night's problem with KFSX Java not only caused a stall/delay in the nightly KFSX production schedule, but many of the nightly processes actually did not run due to the automated cancel and proceed feature (CHAIN_CANCEL) within these chains (see below).  While this automated feature is designed for the occasional specific failure within these chains, the unfortunate side-effect is that when a global process (such as the kfsa_java script) is malfunctioning then **many** nightly batch processes are in essence never performed.  Consequently, due to the potential adverse impact, it is extremely important that any changes to global scripts are adequately tested.

We were successfully running KFS java programs via Applications Manager during the day yesterday, with the last successful execution of a KFS java program being the Clear Cache program at 16:02.  The next java program, which ran around 19:00 failed immediately after executing the "unset CATALINA_OPTS" statement (within the /app/env/kfsprd_common.env script?).  Between 19:00 and 19:11, nine java programs in nine different KFSX chains failed with the same error.  Of those, the five chains listed below (KFSXAPAL, KFSXAPAP, KFSXAPRP, KFSXPDCA, KFSXPDGL) contained the automated cancel - so those processes did **not** run last night - i.e. they were not restarted by David because they were not in backlog.  Only the failed java programs within KFSXPDFR, KFSXSYPG, KFSXCGCF, KFSXFPPC were restarted. Do we know what caused the problem?   Janice.

 

I added the “unset CATALINA_OPTS” command to the kfstrng_common.env script yesterday as part of the work I am doing to diagnose KFS/Tomcat stoppage issues. I need that to occur between a startup and a shutdown. Before making the change, I checked the kfsprd_appworx.env script. The “appworx” script executes the “common” script and then sets CATALINA_OPTS. So, the appworx script essentially unsets CATALINA_OPTS before resetting another value.

What I suspect was happening is that the KFS jobs in AppWorx are set up to not only run the appworx script, but then follow by running the common script. Would you verify whether this is the case or not? If it is, it should not be necessary for the jobs to explicitly execute the common script as the appworx script already handles that. If it isn’t, we will need to look further.  Shawn.

 

The Applications Manager KFSX_JAVA script has not changed since 4/20/2010 – it performs an SSH to the host machine (in this case, Malta2) and specifies the host command (in this case, the kfsx_java_ssh.ksh script) to run on the host machine.  Additional parameters are passed into the kfsx_java_ssh script, such as the java program name to be executed, the “prd” instance qualifier, etc.  Again, the kfsx_java_ssh.ksh script has not changed since 04/20/2010.  Within the kfsx_java_ssh.ksh script, the following command is executed:

. /app/env/kfs${batch_service_env}_appworx.env  (in this case, ${batch_service_env} would evaluate to prd)

I don’t have a login to Malta (or if I do, I don’t remember the password) – but based on the echoing from our joblogs I’m guessing that it is within the kfsprd_appworx.env script (which I believe is maintained by you/DBA’s) that the “. /app/env/kfsprd_common.env” statement would be located.

<</ais02/job/prod/kshexe_ssh.74>> kfsx_java_ssh.ksh prd RunBatch pcardNotificati

<#/ais02/job/temp/kfsx_java_ssh.ksh.24#> hostname Malta                                                                           

<#/ais02/job/temp/kfsx_java_ssh.ksh.31#> batch_service_env=prd                 

<#/ais02/job/temp/kfsx_java_ssh.ksh.32#> batch_service=RunBatch                

<#/ais02/job/temp/kfsx_java_ssh.ksh.33#> [[ RunBatch = RunBatch ]]             

<#/ais02/job/temp/kfsx_java_ssh.ksh.35#> (( 4 < 4 ))                           

<#/ais02/job/temp/kfsx_java_ssh.ksh.40#> export RunBatch_stepName=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00                       

<#/ais02/job/temp/kfsx_java_ssh.ksh.41#> export RunBatch_jobName=KFSXFPPC.pcardNotificationStep.5795021.5795032.00

<#/ais02/job/temp/kfsx_java_ssh.ksh.42#> batch_service_parms=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00                          

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_appworx.env      -> This is in kfsx_java_ssh.ksh

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_common.env       -> This is NOT in kfsx_java_ssh.ksh, so must be within kfsprd_appworx.env

The statement immediately following the “. /app/env/kfsprd_common.env” statement in the kfsx_java_ssh.ksh script is:

BATCH_DIR=${CATALINA_HOME}/webapps/kfs-${batch_service_env}/WEB-INF/classes

Which we actually never got to due to the failure on the “unset CATALINA_OPTS” command – which has been invoked somewhere directly (or indirectly) via the “. /app/env/kfsprd_common.env” statement.

I’m still confused – if you only changed kfstrng_common.env script, then how did that change cause the kfsprd scripts to fail?...Janice.

 

 

 

 

Aborted Module Name:   KFSXFPPC.KFSX_JAVA_03

  Date:        Day:      Time:          Resolution:

02/16/11     Wed        19:01          See follow up below.

 

Error log and follow up comments:

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by

+ MESSAGES TO STD OUTPUT \n***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

 

Several KFSX jobs have failed with "ERROR: Malta SCRIPT ABORTED - EXIT CODE=1". I contacted Kevin who will login to check the scripts. He also suggested calling Shawn to check out Malta. Shawn is also looking.

Shawn modified the script that runs the JAVA. I have restarted all the failed KFSX JAVA jobs.

David.

 

Last night's problem with KFSX Java not only caused a stall/delay in the nightly KFSX production schedule, but many of the nightly processes actually did not run due to the automated cancel and proceed feature (CHAIN_CANCEL) within these chains (see below).  While this automated feature is designed for the occasional specific failure within these chains, the unfortunate side-effect is that when a global process (such as the kfsa_java script) is malfunctioning then **many** nightly batch processes are in essence never performed.  Consequently, due to the potential adverse impact, it is extremely important that any changes to global scripts are adequately tested.

We were successfully running KFS java programs via Applications Manager during the day yesterday, with the last successful execution of a KFS java program being the Clear Cache program at 16:02.  The next java program, which ran around 19:00 failed immediately after executing the "unset CATALINA_OPTS" statement (within the /app/env/kfsprd_common.env script?).  Between 19:00 and 19:11, nine java programs in nine different KFSX chains failed with the same error.  Of those, the five chains listed below (KFSXAPAL, KFSXAPAP, KFSXAPRP, KFSXPDCA, KFSXPDGL) contained the automated cancel - so those processes did **not** run last night - i.e. they were not restarted by David because they were not in backlog.  Only the failed java programs within KFSXPDFR, KFSXSYPG, KFSXCGCF, KFSXFPPC were restarted. Do we know what caused the problem?   Janice.

 

I added the “unset CATALINA_OPTS” command to the kfstrng_common.env script yesterday as part of the work I am doing to diagnose KFS/Tomcat stoppage issues. I need that to occur between a startup and a shutdown. Before making the change, I checked the kfsprd_appworx.env script. The “appworx” script executes the “common” script and then sets CATALINA_OPTS. So, the appworx script essentially unsets CATALINA_OPTS before resetting another value.

What I suspect was happening is that the KFS jobs in AppWorx are set up to not only run the appworx script, but then follow by running the common script. Would you verify whether this is the case or not? If it is, it should not be necessary for the jobs to explicitly execute the common script as the appworx script already handles that. If it isn’t, we will need to look further.  Shawn.

 

The Applications Manager KFSX_JAVA script has not changed since 4/20/2010 – it performs an SSH to the host machine (in this case, Malta2) and specifies the host command (in this case, the kfsx_java_ssh.ksh script) to run on the host machine.  Additional parameters are passed into the kfsx_java_ssh script, such as the java program name to be executed, the “prd” instance qualifier, etc.  Again, the kfsx_java_ssh.ksh script has not changed since 04/20/2010.  Within the kfsx_java_ssh.ksh script, the following command is executed:

. /app/env/kfs${batch_service_env}_appworx.env  (in this case, ${batch_service_env} would evaluate to prd)

I don’t have a login to Malta (or if I do, I don’t remember the password) – but based on the echoing from our joblogs I’m guessing that it is within the kfsprd_appworx.env script (which I believe is maintained by you/DBA’s) that the “. /app/env/kfsprd_common.env” statement would be located.

<</ais02/job/prod/kshexe_ssh.74>> kfsx_java_ssh.ksh prd RunBatch pcardNotificati

<#/ais02/job/temp/kfsx_java_ssh.ksh.24#> hostname Malta                                                                           

<#/ais02/job/temp/kfsx_java_ssh.ksh.31#> batch_service_env=prd                 

<#/ais02/job/temp/kfsx_java_ssh.ksh.32#> batch_service=RunBatch                

<#/ais02/job/temp/kfsx_java_ssh.ksh.33#> [[ RunBatch = RunBatch ]]             

<#/ais02/job/temp/kfsx_java_ssh.ksh.35#> (( 4 < 4 ))                           

<#/ais02/job/temp/kfsx_java_ssh.ksh.40#> export RunBatch_stepName=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00                       

<#/ais02/job/temp/kfsx_java_ssh.ksh.41#> export RunBatch_jobName=KFSXFPPC.pcardNotificationStep.5795021.5795032.00

<#/ais02/job/temp/kfsx_java_ssh.ksh.42#> batch_service_parms=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00                          

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_appworx.env      -> This is in kfsx_java_ssh.ksh

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_common.env       -> This is NOT in kfsx_java_ssh.ksh, so must be within kfsprd_appworx.env

The statement immediately following the “. /app/env/kfsprd_common.env” statement in the kfsx_java_ssh.ksh script is:

BATCH_DIR=${CATALINA_HOME}/webapps/kfs-${batch_service_env}/WEB-INF/classes

Which we actually never got to due to the failure on the “unset CATALINA_OPTS” command – which has been invoked somewhere directly (or indirectly) via the “. /app/env/kfsprd_common.env” statement.

I’m still confused – if you only changed kfstrng_common.env script, then how did that change cause the kfsprd scripts to fail?...Janice.

 

 

 

Aborted Module Name:   KFSXCGCF.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

02/16/11     Wed        19:01          See follow up below.

 

Error log and follow up comments:

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by

+ MESSAGES TO STD OUTPUT \n***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

 

Several KFSX jobs have failed with "ERROR: Malta SCRIPT ABORTED - EXIT CODE=1". I contacted Kevin who will login to check the scripts. He also suggested calling Shawn to check out Malta. Shawn is also looking.

Shawn modified the script that runs the JAVA. I have restarted all the failed KFSX JAVA jobs.

David.

 

Last night's problem with KFSX Java not only caused a stall/delay in the nightly KFSX production schedule, but many of the nightly processes actually did not run due to the automated cancel and proceed feature (CHAIN_CANCEL) within these chains (see below).  While this automated feature is designed for the occasional specific failure within these chains, the unfortunate side-effect is that when a global process (such as the kfsa_java script) is malfunctioning then **many** nightly batch processes are in essence never performed.  Consequently, due to the potential adverse impact, it is extremely important that any changes to global scripts are adequately tested.

We were successfully running KFS java programs via Applications Manager during the day yesterday, with the last successful execution of a KFS java program being the Clear Cache program at 16:02.  The next java program, which ran around 19:00 failed immediately after executing the "unset CATALINA_OPTS" statement (within the /app/env/kfsprd_common.env script?).  Between 19:00 and 19:11, nine java programs in nine different KFSX chains failed with the same error.  Of those, the five chains listed below (KFSXAPAL, KFSXAPAP, KFSXAPRP, KFSXPDCA, KFSXPDGL) contained the automated cancel - so those processes did **not** run last night - i.e. they were not restarted by David because they were not in backlog.  Only the failed java programs within KFSXPDFR, KFSXSYPG, KFSXCGCF, KFSXFPPC were restarted. Do we know what caused the problem?   Janice.

 

I added the “unset CATALINA_OPTS” command to the kfstrng_common.env script yesterday as part of the work I am doing to diagnose KFS/Tomcat stoppage issues. I need that to occur between a startup and a shutdown. Before making the change, I checked the kfsprd_appworx.env script. The “appworx” script executes the “common” script and then sets CATALINA_OPTS. So, the appworx script essentially unsets CATALINA_OPTS before resetting another value.

What I suspect was happening is that the KFS jobs in AppWorx are set up to not only run the appworx script, but then follow by running the common script. Would you verify whether this is the case or not? If it is, it should not be necessary for the jobs to explicitly execute the common script as the appworx script already handles that. If it isn’t, we will need to look further.  Shawn.

 

The Applications Manager KFSX_JAVA script has not changed since 4/20/2010 – it performs an SSH to the host machine (in this case, Malta2) and specifies the host command (in this case, the kfsx_java_ssh.ksh script) to run on the host machine.  Additional parameters are passed into the kfsx_java_ssh script, such as the java program name to be executed, the “prd” instance qualifier, etc.  Again, the kfsx_java_ssh.ksh script has not changed since 04/20/2010.  Within the kfsx_java_ssh.ksh script, the following command is executed:

. /app/env/kfs${batch_service_env}_appworx.env  (in this case, ${batch_service_env} would evaluate to prd)

I don’t have a login to Malta (or if I do, I don’t remember the password) – but based on the echoing from our joblogs I’m guessing that it is within the kfsxprd_appworx.env script (which I believe is maintained by you/DBA’s) that the “. /app/env/kfsprd_common.env” statement would be located.

<</ais02/job/prod/kshexe_ssh.74>> kfsx_java_ssh.ksh prd RunBatch pcardNotificati

<#/ais02/job/temp/kfsx_java_ssh.ksh.24#> hostname Malta                                                                           

<#/ais02/job/temp/kfsx_java_ssh.ksh.31#> batch_service_env=prd                 

<#/ais02/job/temp/kfsx_java_ssh.ksh.32#> batch_service=RunBatch                

<#/ais02/job/temp/kfsx_java_ssh.ksh.33#> [[ RunBatch = RunBatch ]]             

<#/ais02/job/temp/kfsx_java_ssh.ksh.35#> (( 4 < 4 ))                           

<#/ais02/job/temp/kfsx_java_ssh.ksh.40#> export RunBatch_stepName=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00                       

<#/ais02/job/temp/kfsx_java_ssh.ksh.41#> export RunBatch_jobName=KFSXFPPC.pcardNotificationStep.5795021.5795032.00

<#/ais02/job/temp/kfsx_java_ssh.ksh.42#> batch_service_parms=pcardNotificationStep KFSXFPPC.pcardNotificationStep.5795021.5795032.00                          

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_appworx.env      --> This is in kfsx_java_ssh.ksh

<#/ais02/job/temp/kfsx_java_ssh.ksh.62#> . /app/env/kfsprd_common.env       --> This is NOT in kfsx_java_ssh.ksh, so must be within kfsprd_appworx.env

The statement immediately following the “. /app/env/kfsprd_common.env” statement in the kfsx_java_ssh.ksh script is:

BATCH_DIR=${CATALINA_HOME}/webapps/kfs-${batch_service_env}/WEB-INF/classes

Which we actually never got to due to the failure on the “unset CATALINA_OPTS” command – which has been invoked somewhere directly (or indirectly) via the “. /app/env/kfsprd_common.env” statement.

I’m still confused – if you only changed kfstrng_common.env script, then how did that change cause the kfsprd scripts to fail?...Janice.

 

 

 

Aborted Module Name:   AGENDYGN.AGENS006_01

  Date:        Day:      Time:          Resolution:

02/16/11     Wed        20:01          Restarted by ITS.

 

Error log and follow up comments:

 

20:01:44 1781 

20:01:44 1782   gb_common.p_commit();

20:01:44 1783    utl_file.fclose(file_handle);

20:01:44 1784  exception

20:01:44 1785    when others then

20:01:44 1786      -- flush output needed for troubleshooting

20:01:44 1787      put_report_line('Error: ' || sqlerrm);

20:01:44 1788      utl_file.fflush(file_handle);

20:01:44 1789      utl_file.fclose(file_handle);

20:01:44 1790      raise; -- reraise the exception

20:01:44 1791  end;

 

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-20100: ::Hold from date must be less than or equal to hold to date::

ORA-06512: at line 1790

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

 

This person has an MA address with an end date of 01-JAN-2099

11284686    829338399   Rodriguez   Marcia      Stephanie

Beginning 09-FEB-11 and ending 01-JAN-99

AGENS006 is trying to add a hold and failing.  Can you please check with Jeanie Breiner (who last touched this record on 16-FEB-11, yesterday) and see if we can remove the end date that is out 88 years or if perhaps the end date should be something else? When the address record has been fixed, we can restart AGENDYGN.AGENS006_01

Vicki.

 

Just got back from a meeting.  Her address has been corrected.  Please restart the job below.  

Sue Coulson.

 

Thanks Sue for getting this person fixed.  Please restart AGENDYGN.AGENS006_01 that aborted.

Vicki.

 

 

 

 

 

 

Aborted Module Name:   HRMSS231.FTP_TO_SELMAN_01

  Date:        Day:      Time:          Resolution:

02/18/11     Fri          21:22          See note from Janice below.

Error log and follow up comments:

 

 

 

# FATAL : Unexpected error - undefined line returned

#------------------------------------------------------------------------------

# USAGE: /appworx/csu/exec/FTP_ENHANCED.PL \

#     remote_host=override_host_name\

#     transfer_mode=transfer_command\

#     translate=translation_mode\

#     src_file=fully_qualified_source_file\

#     dst_file=fully_qualified_destination_file\

#     site_options=comma_delimited_site_options\

#     local_options=semicolon_delimited_local_options\

#   transfer_mode values

#     append, dir, get, put, recv, send, submit

#   translate values:

#     ascii, binary, ebcdic

#   site_options values

#     comma delimited site options

#       RECFM=FBA,LRECL=133,BLKSIZE=3325

#   local_options values

#     semicolon delimited local options

#       active | passive | cd=remote_dir

#   Also, these environment variable must be set

#     net_connect, db_login, db_password

#------------------------------------------------------------------------------

# exit     : [ 2011.02.18-21:23:02 ] -- RETURN CODE = 100

 

I worked with Elden on this - seems that vendor requires SFTP - hence the FTP failure.  We deleted this chain, deactivated the FTP_TO_SELMAN component and reactivated the SSH_SFTP component in the HRMSS231_HARTFORD_PORT chain and then requested the chain in to run again.  SSH_SFTP successfully transferred the encrypted file to the vendor and the chain has successfully completed. 

David will need to follow up with permanent removal of the FTP_TO_SELMAN component and associated module/login/etc.

Janice.

 

 

 

Aborted Module Name:   OSYSJOBS_04.OSYSLLNK_01

  Date:        Day:      Time:          Resolution:

02/20/11     Sun        16:33          Restarted by ITS.

 

Error log and follow up comments:

 

*** PROCEED WITH EXECUTION OF SCRIPT: sys_llnk_rsh.ksh

***

<</ais02/job/prod/kshexe_rsh.70>> sys_llnk_rsh.ksh

<#/ais02/job/prod/sys_llnk_rsh.ksh.23#> alias log=echo "*** " $(date +%m/%d/%Y-%T)

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_09.OSYSPURG_01.5814124.5814128.00_opmn_logs is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.21#> [[ 1 > 0 ]]

<#errtrap_rsh.21#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Kebler SCRIPT ABORTED - EXIT CODE=1

 

I'm forwarding this email which I sent back in November regarding OSYSLLNK failures.  IT Scheduling should review this email and proceed with troubleshooting today's OSYSJOBS_04.OSYSLLNK_01 as was outlined in this old email. Thanks, Janice

 

Subject: FW: APPWORX ABORT - OSYS

 

I'm forwarding this email which I sent early in November regarding a couple OSYSLLNK failures.  Please review this email for instructions on troubleshooting OSYSLLNK failures and as was indicated near the end of the email:

 

Sometimes we see these "status is not valid" messages for the find command - generally caused by a file being deleted after the find command "found" the filename. Usually, we can just restart jobs in this situation,....

 

Based on the previously communicated information, IT Scheduling could have just restarted the two failed OSYSLLNK components this morning because they both failed with similar 'The status on "filename here" is not valid' errors.

Janice.
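
The "status is not valid" failure above matches the race Janice describes: an entry is listed and then deleted before it can be examined. As a rough illustration only, the sketch below walks a tree, collects symbolic links, and records entries that vanish mid-scan instead of aborting. The function name `list_symlinks` and the idea of reporting vanished paths separately are assumptions made for this example; it is not how sys_llnk_rsh.ksh is written.

```python
import os
import stat


def list_symlinks(root, skip_dirs=("tmp", "proc")):
    """Return (links, vanished) for the tree under root.

    links    -- paths that are symbolic links
    vanished -- paths that disappeared between listing and lstat()
    """
    links, vanished = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Roughly mirrors '-name tmp -prune -o -name proc -prune':
        # do not descend into (or report) directories with these names.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in sorted(dirnames + filenames):
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode  # lstat does not follow links
            except FileNotFoundError:
                # The entry was deleted after it was listed; note it and move on.
                vanished.append(path)
                continue
            if stat.S_ISLNK(mode):
                links.append(path)
    return links, vanished
```

A caller could treat a non-empty `vanished` list as a warning rather than a failure, which is the same judgment the restart instruction above relies on.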

 

 

 

 

Aborted Module Name:   FAIDINST_NW.LYNX_02

 

  Date:        Day:      Time:          Resolution:

02/22/11     Tue        21:07          See follow up below.

 

Error log and follow up comments:

 

 [100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

+ print *** \n*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog_exceptions

+ 1>> /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog /appworx/out/FAIDINST_NW.LYNX_02.5823654.5823658.00.2011_02_22_2107.AWPROD.LOG

+ rm -ef /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

rm: Removing /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

+ grep FTP_

+ print FAIDINST_NW.LYNX_02

+ rm -ef /ais01/dat/work/prod/FAIDINST_NW.LYNX_02_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

 

Tom called.  He would like to restart this from the beginning.

Dawn.

 

Tom called. He would like to put a hold on his request below. He would like to talk to one of his developers first, then he will get back to us.

Joleen.

 

Let's go ahead and get ready for Tom's request by deleting the FAIDINST_SCHOLARSHIPS chain which is currently in backlog, then use the schedule procedure to schedule the FAIDINST_NW schedule with a start time in the future, but before 10:00 A.M.  Then run staging to bring FAIDINST_NW into backlog and be sure to place the chain on hold right away.  This will allow us to get the chain scheduled on the previous virtual day, yet hold it until Tom is ready with his code changes.

 

When FAIDINST_SCHOLARSHIPS was staged in, the value provided for Prompt #4 (Hours ahead to be staged) was large enough that it also staged in tonight's FAIDINST_SCHOLARSHIPS (with 17:00 start time).  In the future, it would be best to provide the smallest value possible for "Hours ahead to be staged" to avoid staging in more than intended/necessary.  In the case with FAIDINST, last night's FAIDAW99 was detecting tonight's FAIDINST in backlog and therefore would not complete.  I updated #FAIDAW99_EXCLUDE_DATE with a value of 02/23 so that FAIDAW99 could complete.  

Janice.
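
The effect of the "Hours ahead to be staged" prompt described above can be pictured as a time window. The sketch below is a simplified model, not the Applications Manager staging logic: the function name `staged_runs`, the 08:00 "now", and the 09:30 rescheduled start are all illustrative assumptions; only the 17:00 start time comes from the note.

```python
from datetime import datetime, timedelta


def staged_runs(now, scheduled_starts, hours_ahead):
    """Return the scheduled start times that fall inside the staging window."""
    cutoff = now + timedelta(hours=hours_ahead)
    return [start for start in scheduled_starts if now <= start <= cutoff]


# Illustrative times only; 17:00 is the start time mentioned in the note above.
now = datetime(2011, 2, 23, 8, 0)
rescheduled = datetime(2011, 2, 23, 9, 30)  # hypothetical "before 10:00 A.M." start
tonight = datetime(2011, 2, 23, 17, 0)      # tonight's regular run

small_window = staged_runs(now, [rescheduled, tonight], hours_ahead=2)
large_window = staged_runs(now, [rescheduled, tonight], hours_ahead=12)
```

With the small window only the rescheduled run is picked up; the large window also pulls in tonight's run, which is the situation the note warns against.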

 

 

 

 

Aborted Module Name:   FAIDINST_NW.LYNX_01

 

  Date:        Day:      Time:          Resolution:

02/23/11     Wed        21:06          See follow up below.

 

Error log and follow up comments:

 

URL=http://wsprod.colostate.edu/cwis231/onet/autorun/partnership_schols.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

 

Looks like we may have a repeat of the same problem we had yesterday - any suggestions?

 

Tom has requested that we run FAIDINST_SCHOLARSHIPS "from the top" again like we did yesterday. 

Therefore, IT Scheduling will need to:

1)  Delete the FAIDINST_SCHOLARSHIPS chain which is currently in backlog

2)  Using the schedule procedure, schedule the FAIDINST_NW schedule to run today with a start time in the future, but before 10:00 A.M. 

3)  Run staging (with a value of 1 for "Hours ahead to be staged" prompt value ) to bring FAIDINST_NW into backlog.  Once the "new" FAIDINST_SCHOLARSHIPS chain for FAIDINST_NW schedule has been staged into backlog, it may be released to run right away.

Janice.

 

 

 

 

 

Aborted Module Name:   AGENWYWP.AGENS004_01

  Date:        Day:      Time:          Resolution:

02/23/11     Wed        18:03          Restarted by ITS.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error: character to number conversion error

ORA-06512: at line 633

 

18:03:07 629   --delete from csug_purge_ids where marked_flag = 'Y';

18:03:07 630    utl_file.fclose(file_handle);

18:03:07 631    utl_file.fclose(file_error);

18:03:07 632    utl_file.fclose(file_purge);

18:03:07 633    raise;

18:03:07 634  --

 

Where is the UTL file output that would tell us who was the last person processed?

Vicki.

 

CORRECT ERROR:                                                                       

-29422721    11296725               -Pirge Fulton-Beale    Davie    Nathaniel28-JAN-11

-29436203    11298660               Purge-KROPUENSKE    LEAH    ROSE         19-JAN-11

-29496592    11307296               Dubosson    Anne Sophie                  03-FEB-11

Persons not added to purge file: 3

 

Please restart this chain at AGENS004. I needed to see the UTL file output which Janice helped me find in /orautl/BANPROD/chain name...It identified the person in error, Sue Coulson fixed the problem and we should be able to restart.

Vicki.

 

 

 

 

Aborted Module Name:   ODBABKUP_BANDORA_DB

  Date:        Day:      Time:          Resolution:

02/24/11     Thu        16:30          Deleted by ITS.

 

Error log and follow up comments:

 

 

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog /appworx/out/ODBABKUP_BANDORA_DB.5835293.5835293.00.2011_02_24_1630.AWPROD.LOG

+ rm -ef /ais01/dat/work/prod/ODBABKUP_BANDORA_DB_jobstat

rm: Removing /ais01/dat/work/prod/ODBABKUP_BANDORA_DB_jobstat

+ grep FTP_

+ print ODBABKUP_BANDORA_DB

+ rm -ef /ais01/dat/work/prod/ODBABKUP_BANDORA_DB_jobstat

+ [[ n = y ]]

+ [[ ABORTD = ABORTD ]]

 

Please cancel that chain, looks like KFSTRNG is still not quite right. I will fix it myself today and manually run the rest of the backups.

Mark B.

 

And just so everyone knows - this was Shawn's fault.

Shawn.

 

It's easy to forget that notify file...it's all good.

Mark B.

 

 

 

Aborted Module Name:   FAIDTRAK_EV.LYNX-01

  Date:        Day:      Time:          Resolution:

02/25/11     Fri          04:15          Restarted by David.

 

Error log and follow up comments:

 

 

URL=http://wsprod.colostate.edu/cwis231/autorun/spring_only_no_spring_aprd.cfm?ay=FAIDTRAK_EV (GET)

STATUS=HTTP/1.1 503 Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

 

 

I decided to re-start this and it finished okay. Unfortunately, the schedule is way behind.

Does the STATUS=HTTP/1.1 503 Server Error mean it had trouble connecting to the web site?

David.
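
On David's question: a status in the 5xx range means the web server did send back a response, so the connection itself was made; 503 is the standard HTTP "Service Unavailable" code, returned when the server cannot handle the request at that moment. The sketch below shows one way a status line like the one logged above could be classified. The function name and the category labels are invented for illustration and are not part of the LYNX module.

```python
import re

# Matches lines such as "STATUS=HTTP/1.1 503 Server Error".
STATUS_RE = re.compile(r"^STATUS=HTTP/\d\.\d (\d{3})\b")


def classify_status_line(line):
    """Map a logged STATUS= line to a coarse category."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        return "unparseable"
    code = int(match.group(1))
    if 200 <= code < 300:
        return "ok"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        # The server answered, but reported that it could not serve the request.
        return "server error"
    return "other"
```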

 

 

 

Aborted Module Name:   AGENORGN.CHAIN_VPLUS_PS_01

 

  Date:        Day:      Time:          Resolution:

02/28/11     Mon       16:02          See follow up from Janice below.

 

Error log and follow up comments:

 

+ The file access permissions do not allow the specified action.

***

*** END SEARCH OF LAST JOBID(5850710.00) AFTER CONDITIONS RUNHOSTLOG FOR ERROR STRINGS

***

 

I'm a bit confused/concerned regarding the course of events with the AGENORGN.CHAIN_VPLUS_PS_01 chain component abort. Although I don't see an email reporting it, this component aborted at 16:02 with the same error as listed below for the failure at 16:18.  However, it appears that the 16:02 aborted AGENORGN.CHAIN_VPLUS_PS_01 was restarted by IT Scheduling and it completed at 16:03.  The problem is that appropriate follow-up was NOT taken for this aborted component.  As the error message indicates, it failed due to errors in the AFTER conditions of the previous jobid (i.e. the AGENS002 component).  Review of the associated runhost log file for this chainid, /ais01/joblog/runhost_5850540_AWPROD.log, reveals that the error occurred when the AGENORGN.AGENS002_01 AFTER condition was trying to empty the user's data file but the group permissions on the file did not allow "write" access.  The chain component includes this condition to empty the input data file to ensure that the same user data file is not processed the next time the chain runs - this is an extremely important action to avoid duplicate processing of data.  Appropriate follow-up **must** always be taken to determine why an AFTER condition encountered an error and rectify the problem.  In this case, appropriate follow-up should have included changing the unix permissions to allow group write access and then manually emptying the file - the easiest way to manually accomplish this is to delete the file and then recreate it as empty. 

 

The chain was run again at 16:16 - apparently with the same user input data file. Of course, the  AGENORGN.CHAIN_VPLUS_PS_01 failed again with the same problem due to the AFTER condition's failure to be able to empty the user's data file.  Again the chain component was restarted - however, this time it appears that David manually performed the necessary follow-up to modify unix permissions and empty the /userfiles/Uareg/data/AGENORGN.AGENS002_01.DAT file.

 

Comments/questions?

Janice.
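
The manual follow-up described above (allow group write, then empty the data file) could be sketched as follows. This is an illustration under assumptions, not the AFTER condition itself: the helper names are hypothetical, and it truncates the file in place rather than deleting and recreating it as the note suggests.

```python
import os
import stat


def group_can_write(path):
    """True when the file's permission bits include group write."""
    return bool(os.stat(path).st_mode & stat.S_IWGRP)


def empty_data_file(path):
    """Grant group write if missing, then truncate the file to zero bytes."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if not mode & stat.S_IWGRP:
        # chmod succeeds only for the file owner (or root).
        os.chmod(path, mode | stat.S_IWGRP)
    with open(path, "w"):
        pass  # opening for write truncates the existing contents
```

Checking `group_can_write` before a chain runs would surface the permission problem earlier than the AFTER condition does.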

 

 

 

 

Aborted Module Name:  HRMSMGMT.HRMSR053_01

  Date:        Day:      Time:          Resolution:

03/02/11     Wed       06:30            See follow up from Janice below.

03/04/11     Fri          22:16            See follow up below.

 

Error log and follow up comments:

 

03/02/11.    

Craig called last night when he "killed" the long running HRMSR053.  After that, it was acting weird in Applications Manager in that the chain component status was in KILLING status - I thought after a while it might change to KILLED... but not yet!  I removed predecessors on the VPLUS_RCAPTURE so that we could go ahead and get the HRMSR001/1SS reports captured.  Then I changed the HRMSMGMT chain so it was not single run and requested another one in... deleted the HRMSR001/001SS reports which have already run so we could try HRMSR053 again - it's been running now for a little over an hour.  The daily version of HRMSR053 ran okay last night... so at least with an early morning start of HRMSR053, we'll maybe know during working hours if it will finish in 7 hours like it historically did before the upgrade.  Seems odd that the daily version is running okay, but the monthly was doing who knows what for 25+ hours and still didn't finish.  I think we may have to have Rich/Greg delete the HRMSR053 jobid that is stuck in "KILLING" status from the so_job_queue table in AWPROD.

 

Greg/Rich,

Please delete jobid 5851789 (HRMSMGMT.HRMSR053_01) from so_job_queue table - this chain component is stuck in KILLING status.  I've tried deleting the entire chain, but nothing happens.  No "operations" are available for 5851789 - i.e. cannot directly DELETE, KILL, RESET, etc for this chain component.

Janice.

 

03/04/11.

REP-0110: Unable to open file 'HRMSR053.rdf'.

REP-1070: Error while opening or saving a document.

REP-0110: Unable to open file '/app/oracle/apps/12/hrprodappl/csuh/12.0.0/reports/US/HRMSR053

 

The "daily" version of HRMSR053 has completed:

Started:2011-03-07 08:26:41.0,Finished:2011-03-07 13:03:05.0, Elapsed: 04:36:24

At least for the "daily" version there was no improvement in runtime, with previous post-upgrade executions of the "daily" version completing with similar, but slightly shorter, elapsed times:

3/3/11  - 4:04:50

3/2/11  - 3:57:24

2/28/11 - 4:16:48

We were however running the "daily" version during the day so perhaps the increased runtime may have something to do with hrprod being busier due to online activity/competition.

The "monthly" version of HRMSR053 has been running now for a little over 5 hours.  One thing we learned last week was that the "monthly" version and "daily" version do not walk the same logic, so hopefully we will see improvement in runtime for the "monthly" version. 

Janice.
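
The runtime comparison above is simple to verify. This sketch recomputes the 03/07 elapsed time from the logged start and finish stamps and compares it with the three earlier post-upgrade runs; the helper names `elapsed` and `hms` are made up for the example.

```python
from datetime import datetime, timedelta


def elapsed(started, finished, fmt="%Y-%m-%d %H:%M:%S.%f"):
    """Difference between two timestamps in the logged format."""
    return datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)


def hms(text):
    """Convert an 'H:MM:SS' string to a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


latest = elapsed("2011-03-07 08:26:41.0", "2011-03-07 13:03:05.0")
previous = [hms("4:04:50"), hms("3:57:24"), hms("4:16:48")]  # 3/3, 3/2, 2/28

# How much longer the latest run took than the slowest earlier run.
slowdown_vs_slowest = latest - max(previous)
```

The latest run comes out at 4:36:24, about 19 and a half minutes longer than the slowest of the three earlier runs.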

 

I am following this program in Enterprise Manager and it is not getting stuck in the SQL statement that it was stuck in last week. It is still running that code, but it is doing it much faster. So, it appears to be working much better in that regard.  Time will tell.

Craig P.

 

 

 

Aborted Module Name:   HRMSSUMC_02.HRMSR053_01

  Date:        Day:      Time:          Resolution:

03/04/11     Fri          22:31           See follow up below.

 

Error log and follow up comments:

 

 

REP-0110: Unable to open file 'HRMSR053.rdf'.

REP-1070: Error while opening or saving a document.

REP-0110: Unable to open file '/app/oracle/apps/12/hrprodappl/csuh/12.0.0/reports/US/HRMSR053.rdf'.

 

Program exited with status 1

Concurrent Manager encountered an error while running Oracle*Report for your concurrent request 6641407.

 

This has been fixed.

-Bob-

 

The "daily" version of HRMSR053 has completed:

Started:2011-03-07 08:26:41.0,Finished:2011-03-07 13:03:05.0, Elapsed: 04:36:24

 

At least for the "daily" version there was no improvement in runtime, with previous post-upgrade executions of the "daily" version completing with similar, but slightly shorter, elapsed times:

3/3/11  - 4:04:50

3/2/11  - 3:57:24

2/28/11 - 4:16:48

We were however running the "daily" version during the day so perhaps the increased runtime may have something to do with hrprod being busier due to online activity/competition.

 

The "monthly" version of HRMSR053 has been running now for a little over 5 hours.  One thing we learned last week was that the "monthly" version and "daily" version do not walk the same logic, so hopefully we will see improvement in runtime for the "monthly" version. 

Janice.

 

I am following this program in Enterprise Manager and it is not getting stuck in the SQL statement that it was stuck in last week. It is still running that code, but it is doing it much faster. So, it appears to be working much better in that regard.  Time will tell.

Craig P.

 

 

 

Aborted Module Name:  ADMSSCOR.LYNX_01

  Date:        Day:      Time:          Resolution:

03/11/11     Fri          06:04           Restarted by ITS.

 

Error log and follow up comments:

 

Looking up wsnet.colostate.edu

Making HTTPS connection to wsnet.colostate.edu

Alert!: Unable to connect to remote host.

 

lynx: Can't access startfile https://wsnet.colostate.edu/ai/tools/RecruitmentPlus/TestScores.aspx

***

*** /appworx/out/ADMSSCOR.LYNX_01.status.txt ***

 

URL=https://wsnet.colostate.edu/ai/tools/RecruitmentPlus/TestScores.aspx (GET)

STATUS=HTTP/1.1 200 OK

***

[101] : *** ERROR Detected in Output : File Empty ***

+ err=101

 

The web services connectivity has been resolved.

Please restart ADMSSCOR.

Phil.

 

Phil was just asking me if the ADMSSCOR ran okay after it was restarted earlier this morning.  I think the resolution message(s) need to be sent to all of the recipients who received the original “follow-up/troubleshooting required” email(s).  IS Developers (such as Phil/Rami/Bev) who are responsible for production follow-up are not necessarily “watching” Applications Manager, so including them on the “issue resolved” emails will be helpful.

Janice.

 

 

 

Aborted Module Name:   OSYSJOBS_06.OSYSPURG_01

  Date:        Day:      Time:          Resolution:

03/18/11     Fri          16:32           Restarted by ITS.

 

Error log and follow up comments:

 

<#/ais02/job/temp/sys_purg_rsh.ksh.982#> find /nor_orautl/kfs4test -type f -mtime +7 -print

<#/ais02/job/temp/sys_purg_rsh.ksh.982#> 1>> /ais02/dat/work/prod/OSYSJOBS_06.OSYSPURG_01.5945291.5945294.00_too_old

find: 0652-019 The status on /nor_orautl/kfs4test is not valid.

<#/ais02/job/temp/sys_purg_rsh.ksh.982#> errtrap_rsh /ais02/job/temp/sys_purg_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.87#> [[ 1 > 0 ]]

<#errtrap_rsh.87#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Empire SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Empire SCRIPT ABORTED - EXIT CODE=1

 

Joleen discovered that Janice’s e-mail on November 1st instructed us to restart these if it said, “Status is not valid”.  So, I restarted these.

Dawn.

 

I think this one might be caused by a bad link on Empire – the /orautl/kfs4test link points to /nor_orautl/kfs4test:

# ls -l | grep kfs4

lrwxrwxrwx    1 oracle   dba              20 Mar 18 14:36 kfs4test@ -> /nor_orautl/kfs4test

But, /nor_orautl/kfs4test doesn’t exist:

# ls -l /nor_orautl/kfs4test

ls: 0653-341 The file /nor_orautl/kfs4test does not exist.

 

Rich/Greg,

Can you check this out – and delete link and/or work with dba(s) to determine if it should be there?

Janice.
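
The `ls` output above shows a link whose target does not exist. A check for that condition can be sketched in a few lines; the function name is hypothetical and this is only an illustration of the test, not something the OSYSPURG script is known to do.

```python
import os


def is_dangling_symlink(path):
    """True when path is a symbolic link whose target does not exist."""
    # islink() inspects the link itself; exists() follows it to the target.
    return os.path.islink(path) and not os.path.exists(path)
```

Running such a check over the purge directories before the `find` step would report a broken link by name instead of failing with "status is not valid".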

 

Shawn renamed kfs4test to kfstest4 on Norrie’s orautl.

So the link on Empire should be ok now.

Not sure if this needs to be restarted now that the problem is fixed.

Rich.

 

By the way, I didn’t notice this at first… but eventually  the OSYSPURG was failing because we were trying to run two OSYSPURG components at the same time for the same host.  This occurred because we had two failed OSYSJOBS chains  for the same schedule (OSYSJOBS_06) “backlogged” – and restarting them simultaneously resulted in stepping on each other’s toes.   In the future, one of the duplicate “backlogged” chains should just be deleted (usually the oldest one), and then the failed component in the remaining chain for the schedule can be restarted, after troubleshooting.  This is the course of action which I took for the last OSYSJOBS_06.OSYSPURG_01 failures this morning. 

NOTE:   This method is only applicable for multiple occurrences of the **SAME** schedule, not failed OSYSJOBS chains for different schedules.

Janice.
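
The rule above (for duplicate backlogged chains of the same schedule, delete all but one, usually keeping the newest) can be sketched as a small selection function. This is a toy model: the function name is hypothetical, the ids other than 5945291 are invented, and it assumes a larger chain id means a more recently created chain.

```python
from collections import defaultdict


def duplicates_to_delete(backlog):
    """Given (schedule, chain_id) pairs, return the chain ids to delete.

    For each schedule the highest chain id is kept (assumed to be the
    newest); chains of different schedules never affect each other.
    """
    by_schedule = defaultdict(list)
    for schedule, chain_id in backlog:
        by_schedule[schedule].append(chain_id)
    to_delete = []
    for chain_ids in by_schedule.values():
        to_delete.extend(sorted(chain_ids)[:-1])  # everything except the newest
    return to_delete


# 5945291 appears in the log above; the other ids are made up.
backlog = [
    ("OSYSJOBS_06", 5945291),
    ("OSYSJOBS_06", 5945100),
    ("OSYSJOBS_04", 5944000),
]
```

Here only the older OSYSJOBS_06 chain is selected for deletion; the OSYSJOBS_04 chain is untouched because it belongs to a different schedule.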

 

 

 

 

Aborted Module Name:   AREGDYTR.CONVERT_PDFTOPS_01

  Date:        Day:      Time:          Resolution:

03/29/11     Tue        07:20           Restarted by ITS per Josh’s instructions.

 

Error log and follow up comments:

 

Module AREGDYTR.CONVERT_PDFTOPS_01 has aborted.  No output file.

The before Condition Details indicate it is checking for a file that does not exist:

{#spool_out}/AREGDYTR.AREGR600.{chain_id}.PDF

 

Here is how we need to proceed:

Go ahead and skip (delete) the CONVERT_PDFTOPS and SPOOL_FILTER_01 modules.

Then let the rest of the chain run. (AREGS604 on)

Josh.

 

 

 

 

 

Aborted Module Name:   AGENDYHB.SRRSRIN_01

 

  Date:        Day:      Time:          Resolution:

03/29/11     Tue        19:04           Restarted by ITS.

 

Error log and follow up comments:

 

 

Username:

Password: Connected.

 

VOID TIMER(int timemode: 1)

RUN SEQUENCE NUMBER:

Parameter 01 HRMS Read from Job Submission

Parameter 03 A Read from Job Submission

Parameter 04 N Read from Job Submission

Parameter 02 was not found in Job Submission

Parameter 99 55 Read from Job Submission

Parameter 05 was not found in Job Submission

 

ORA-08176: consistent read failure; rollback data not available

ORA-06512: at "BANINST1.GOKCMPK", line 914

ORA-06512: at "BANINST1.GOKCMPK", line 2327

ORA-06512: at "BANINST1.GOKCMPK", line 2717

ORA-06512: at line 1

 

WRN-ORACERR: Error occurred in file "srrsrin.pc" at line 2,798

WRN-ERRSTMT: Following statement was last statement parsed:

    declare birth_yr varchar2 ( 4 ) := '' ; BEGIN if ( ( :birth_year is nu

srrsrin terminated with error

22 lines written to /appworx/out/AGENDYHB.SRRSRIN_01.5999779.5999791.00.2242947.lis

 

I would like to try and just restart this step and see what happens.

Vicki.

 

 

 

Aborted Module Name:   KFSXAPEI.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

03/30/11     Wed        06:03           See notes below.

 

Error log and follow up comments:

 

 

Error Message:

2011-03-30 06:14:15,619 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXAPEI.electronicInvoiceExtractStep.6003882.6003885.00 steps: [electronicInvoiceExtractStep]

2011-03-30 06:14:15,619 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) java.lang.OutOfMemoryError

RunBatch ERROR: Exception found:

java.lang.OutOfMemoryError

 

Is it possible that we need to override the catalina  memory option for this java program, increasing it from the default 1g to 2g?

I'm assuming that it will be okay to just restart this component - or is there any data cleanup necessary?  If okay to restart, then IT Scheduling may proceed as follows:

1)  In **BACKLOG**, provide the following value for prompt 5 "Catalina Opts Memory Override - blank to use default value of 1g, 2g for 2gb, etc." on the aborted KFSXAPEI.KFSX_JAVA_01 component:

2g

2)  Restart the aborted KFSXAPEI.KFSX_JAVA_01

 

The 2g value will also need to be provided as a permanent change to the KFSXAPEI.KFSX_JAVA_01 component in the chain definition.

Janice.

 

I had to manually remove the list of *.processed files in the Staging directory. Please restart the failed component.

John.

 

Did you have to remove the *.processed files because there was no commit prior to the java program failure?  In other words, all the xml files which had processed prior to the failure were not committed and therefore removal of their associated .processed file(s) was necessary to ensure that these xml files would be re-processed upon restarting the aborted java program?

Janice.

 

Correct. We can sign into KFSPRD and check to see if PREQs (payment requests) are created by user KFS which is a system user name. I also checked for EIRT documents (Electronic Invoice Reject Document). So the eInvoice xml files are loaded and create either the PREQ or EIRT. There were no documents with Initiator of KFS, so I knew that the job rolled back the transactions, even though we had *.processed on almost all *.xml files.

John.
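
The cleanup John describes (removing the *.processed markers so the xml files are picked up again on restart) could be scripted along these lines. This is a sketch under assumptions: the function name is hypothetical, the staging directory is passed in by the caller, and the only naming rule assumed is that marker files end in `.processed`.

```python
from pathlib import Path


def clear_processed_markers(staging_dir):
    """Delete every *.processed marker in staging_dir; return the names removed."""
    removed = []
    for marker in sorted(Path(staging_dir).glob("*.processed")):
        marker.unlink()
        removed.append(marker.name)
    return removed
```

As in the exchange above, this is only appropriate after confirming that the failed run rolled back its transactions; otherwise the xml files would be processed twice.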

 

 

 

Aborted Module Name:   AREGHRTM_SM.AREGS415_01

 

  Date:        Day:      Time:          Resolution:

04/04/11     Mon        09:02           See follow up below.

 

Error log and follow up comments:

 

 

ERROR at line 1:

ORA-29280: invalid directory path

ORA-06512: at "SYS.UTL_FILE", line 41

ORA-06512: at "SYS.UTL_FILE", line 478

ORA-06512: at line 94

Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

 

This has been resolved.  I made a slight change to the jobprd@banprod login because I was trying to get the 'CHECK CONNECTION' feature to work for AREGRTWL chain condition logic - but it ended up causing other scripts to fail :)  I switched it back and am throwing in the towel on using the 'CHECK CONNECTION' feature for AREGRTWL.  For more info, see the IS NEWS email reply which I'm about to send....

Regarding the AREGRTWL_SM/SP DB ERRORS that Dawn reported:

These were related to the Banner shutdown.  Since this chain uses subvars with underlying sql against BANPROD as Chain prompts, we really need to check for the /ais01/dat/misc/prod/BANPROD_shutdown_for_maint in CHAIN **BEFORE** conditions.  I'm actually surprised that we didn't get a Launch error, instead of the DBERROR.  At any rate, I've added the following **CHAIN** BEFORE condition to the AREGRTWL_SLEEP_WAKE_PROCESS chain:

Check for the /ais01/dat/misc/prod/BANPROD_shutdown_for_maint - if exists, CANCEL CHAIN.

 

I tried to also add a 'CHECK CONNECTION' CHAIN BEFORE condition to verify the jobprd@banprod oracle connection but couldn't get the 'CHECK CONNECTION' feature to work properly.

 

From some testing which I've done this morning, it appears that this CHAIN BEFORE condition will execute before an attempt is made to evaluate subvars associated with chain prompts.
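
The logic of the CHAIN BEFORE condition described above amounts to a flag-file check. The sketch below models only that decision; the function name and the returned strings are illustrative assumptions, and Applications Manager evaluates the real condition itself.

```python
import os

SHUTDOWN_FLAG = "/ais01/dat/misc/prod/BANPROD_shutdown_for_maint"


def chain_before_action(flag_path=SHUTDOWN_FLAG):
    """Return the action to take before the chain's prompts are evaluated."""
    if os.path.exists(flag_path):
        # Maintenance flag present: do not touch BANPROD at all.
        return "CANCEL CHAIN"
    return "RUN"
```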

Hopefully this condition will prevent a reoccurrence of the situation which occurred Sunday morning.  Keep in mind that the AREGRTWL_SLEEP_WAKE_PROCESS chain runs every 30 minutes, 24-7.  Therefore, unless the chain is failing every 30 minutes, it would be okay to delay Sunday follow-up for similar situations until the next working day because subsequent iterations would have successfully run (every 30 minutes).  Next working day follow-up would consist of deleting the chain - no need to restart it since many iterations (every 30 minutes) would have already run.   The same would be true for other frequency scheduled chains, such as AREGHRTM_SECTION_ENROLLMENT, which runs every hour, 24-7.

If waiting until next working day for follow-up on Sunday DBERROR(s) from frequency scheduled chains, the DBERROR pages would not continue on Sunday because the APWXCHK_BACKLOG component of the APWXCHCK_HOURLY_SYSTEM_CHECK chain (which sent that page), is skipped based on the SUNDAY_ROLLF calendar.  Therefore, once the virtual day for Sunday begins at 10 A.M., then DBERROR/LAUNCH ERROR pages will **not** be sent for the remainder of Sunday (or the rolled forward "Sunday" for holidays) until 10 A.M. the next working day.

Janice.

 

 

 

 

Aborted Module Name:   AROSDGLI.AROSS162_01

  Date:        Day:      Time:          Resolution:

04/29/11     Mon        01:35          Restarted by ITS.

 

Error log and follow up comments:

ERROR at line 1:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "ODSMGR.ROKODST", line 362

ORA-00001: unique constraint (FAISMGR.RORNCHG_INDEX_01) violated

ORA-06512: at "ODSMGR.ROKODST", line 16

ORA-06512: at line 1

ORA-06512: at "ODSMGR.GOKODST", line 69

ORA-06512: at "TAISMGR.TT_TBRACCD_INSERT_ODS_CHANGE", line 24

ORA-04088: error during execution of trigger

ORA-06512: at "BANINST1.DML_TBRACCD", line 68

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 1685

ORA-06512: at line 1707

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

***

 

Since this error was caused by a deadlock, nothing was committed and the process rolled back.

Therefore I would like to request that this process be restarted.

Josh.

 

 

 

Aborted Module Name:  AROSDRFD_RC.AROSS002_01

  Date:        Day:      Time:          Resolution:

04/29/11     Mon        13:33          See follow up below.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-02291: integrity constraint (GENERAL.FK1_GLBEXTR_INV_GLBSLCT_KEY) violated

ORA-06512: at line 1706

ORA-06512: at line 1988

 

Steve located more information in the AROSDRFD_RC.AROSS200_01.utl_file1:

 

Current term: 201110

6 terms ago: 200910

Error when converting comm to student for pidm: 10704262 : ST :ORA-20100: ::Invalid Address code and sequence.::

******************************************************

*****Commercial accounts with default CD profile created *****

 

Can we proceed to the next job chain?

Jacque Clark.

 

I am looking into why it failed.

I would like to verify that this issue is isolated to test and will not appear in prod.

Can we wait a bit before proceeding?

Josh.

 

Yes.  Please let me know if there is anything I need to do.

Jacque Clark.

 

It appears that the selection is not valid.

These definitions do not exist.

13:33:33 SQL> define AROS_SELECTION=AROSDRFD_RC_REFUND_ACH

13:33:33 SQL> define AROS_F_SELECTION=AROSDRFD_RC_EX_REFUND_ACH

Since it appears that the RC chain is the one running, I believe the prompts should be fed as follows:

AROS_SELECTION:  AROSDRFD_RC_REFUND_CHECK

AROS_F_SELECTION: AROSDRFD_RC_EX_REFUND_CHECK

 

Then the job can be restarted.

The selection should also be changed in the TSRRFND modules.

I am assuming those values are set since the chain has been brought in.

Josh.

 

 

 

 

 

Aborted Module Name:   AREGORGN.AREGS518_01

 

  Date:        Day:      Time:          Resolution:

05/09/11     Mon       14:39           See follow up below.

 

Error log and follow up comments:

 

 

11141703 828337791 Roberts, Andrew                201090 63277 MATH 261  R    RD        RF

11141753 828338155 Kelly, Larissa                 201010 11516 BZ   110  T    RD        RF

11142893 828346163 Ansah-Twum, Derek              201090 62137 CHEM 111  T    RD        RD

ERROR at line 1:

ORA-20000: ORU-10027: buffer overflow, limit of 100000 bytes

ORA-06512: at "SYS.DBMS_OUTPUT", line 32

ORA-06512: at "SYS.DBMS_OUTPUT", line 97

ORA-06512: at "SYS.DBMS_OUTPUT", line 112

ORA-06512: at line 327

 

14:39:36  32             ,g.shrtckg_gmod_code                               GRADE_MODE

14:39:36  97         AND r.sfrstcr_pidm        = d.swrgpcd_pidm

14:39:36 112                                         AND s.ssbsect_crse_numb = TRIM(SUBSTR(d.swrgpcd_attr10

 

There should be a standard for enabling output.

I think it is:

Set serveroutput on size 1000000

 

In AREGS518, the size was set to 100000 instead of 1000000.
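
If 1000000 is adopted as the standard, scripts that set a smaller buffer could be found with a simple scan. The Python sketch below is only an illustration: the function name is hypothetical, the 1000000 threshold is the value suggested above rather than a confirmed standard, and it only recognizes the fully spelled "set serveroutput on size N" form.

```python
import re
from typing import List

# Buffer size suggested above as the standard (an assumption, not a confirmed rule).
STANDARD_SIZE = 1000000

# Matches lines such as "Set serveroutput on size 1000000" (case-insensitive).
_SERVEROUTPUT = re.compile(
    r"^\s*set\s+serveroutput\s+on\s+size\s+(\d+)",
    re.IGNORECASE | re.MULTILINE,
)


def undersized_serveroutput(script_text: str, minimum: int = STANDARD_SIZE) -> List[int]:
    """Return every serveroutput size in the script that is below the minimum."""
    sizes = [int(n) for n in _SERVEROUTPUT.findall(script_text)]
    return [n for n in sizes if n < minimum]
```

For the setting found in AREGS518, a script containing "set serveroutput on size 100000" would be reported with the value 100000.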

Josh.

 

 

Aborted Module Name:   HRMSS230.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

05/14/11      Sat       04:22             See follow up below + attached email.

 

Error log and follow up comments:

 

#   SRC FILE       : /ais01/ftp/to/user/HRMSS230.VSP.DAT

#   DST FILE       : g0021702@ftp.vsp.com:/prod/g0021702

#   IDENTITY       : /home/jobprd/.ssh/csu_to_vsp-4096-20100924

#   DIR HOST       :

#   DIR LOCAL      :

#   CHMOD          :

# > Couldn't stat remote file: Permission denied

# > Can't ls: "/prod/g0021702" not found

# > sftp> put /ais01/ftp/to/user/HRMSS230.VSP.DAT /prod/g0021702

# > Uploading /ais01/ftp/to/user/HRMSS230.VSP.DAT to /prod/g0021702

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

# > Can't ls: "/prod/g0021702" not found

# > sftp> exit

# > (0)

# > (0)

#------------------------------------------------------------------------------

# RETURN CODE = 0

#==============================================================================

 

It appears that the file was uploaded to the vendor successfully, but could someone on the HR team contact the vendor to confirm this? 

The job failure was due to the "Permission denied" messages when trying to "list" (-ls -l) the file on the vendor's site:

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

Previous executions of this chain indicate that we have not received the "Permission denied" message when attempting to "list" the file before and/or after the upload.

**Has something changed on the vendor's side which is preventing us from performing this command - and causing this error message?

Once confirmation has been received that the file upload to the vendor was successful, this failed component may be deleted to allow the chain to complete.  If the file was not transmitted successfully, then this failed component should be restarted - however, if the "Permission denied" error message is produced when trying to list the file, the component will fail again.

Janice.

 

Do you have a contact at VSP that can verify they received this file?

Steve H.

 

I sent out two separate emails this morning and have not yet received a response.  I'm very confident that they got the file and that the ABORT was due to a permissions change on their directory.

I'll let everyone know as soon as I get a response from VSP.

-Bob-

 

Earlier this afternoon, Bob V. and I discussed this issue and decided that if VSP did not respond before Bob leaves for the day (at 3:30 P.M.), then we would assume that the vendor received the file and delete the aborted HRMSS230.SSH_SFTP_01 component - thereby allowing remainder of the chain to complete.

Please proceed with deleting HRMSS230.SSH_SFTP_01 from backlog as soon as possible.

Janice.

 

 

 

Aborted Module Name:   HRMSR188.SEND_MAIL_OAE_01

  Date:        Day:      Time:          Resolution:

05/16/11     Mon       22:18          Restarted by ITS.

05/17/11     Tue        07:46          Restarted by ITS.

 

 

Error log and follow up comments:

 

05/16/11.

# FATAL : Error opening address file (SEND_MAIL_HRSAO_BENEFITS.LST) : A file or directory in the path name does not exist.

***

Please correct the recipients prompt in backlog for HRMSR188.SEND_MAIL_OAE_01 from SEND_MAIL_HRSAO_BENEFITS.LST to /ais01/dat/misc/mailst/SEND_MAIL.HRSAO_BENEFITS.LST

 

Also, follow up by correcting the HRMSR188.OAE-FEEDBACK_01 chain component prompt #6 (Mailing List) value from:

SEND_MAIL_HRSAO_BENEFITS.LST

to:

{#mailst}/SEND_MAIL.HRSAO_BENEFITS.LST

 

I think we all missed the typo of SEND_MAIL_ (when it should have been SEND_MAIL.), but we did previously discuss the need for the path, {#mailst}/, to be included within this prompt value.

Janice.

 

05/17/11.

# Passing Parms : arg=[ from="appworx@mailer.is.colostate.edu" reply_to="" to="/ais01/dat/misc/mailst/SEND_MAIL.HRSAO_BENEFITS.LST" cc="" bcc="" subject="HRMSR188 Completed" --options=" ERROR -999 ORA-01722: invalid number" --options=""]

/usr/bin/perl /appworx/csu/exec/SENDMAIL.PL  from="appworx@mailer.is.colostate.edu" reply_to="" to="/ais01/dat/misc/mailst/SEND_MAIL.HRSAO_BENEFITS.LST" cc="" bcc="" subject="HRMSR188 Completed" --options=" ERROR -999 ORA-01722: invalid number" --options=""

#==============================================================================

# [ 2011.05.17-07:46:30 ]

#******************************************************************************

# FATAL : < main::parse

# FATAL : Unknown option ( ERROR -999 ORA-01722: invalid number)

 

Sometimes we see this problem with the SEND_MAIL and the multiselect parameter - there's usually no easy way to fix this, as we get an AWE-9999 Internal error if we try to click on the "Select" button for the Options prompt - i.e. it's a catch-22, it doesn't like the Ref=null value, but it won't let us edit the prompt.

Since this chain simply produces a report, I suggest that we just re-run the HRMSR188 chain - ***make sure to fix the OAE-FEEDBACK prompt in the chain first***, as described in earlier email.  Delete this failed HRMSR188.SEND_MAIL_OAE_01 component to let the chain complete and then request the chain to run again. 

Janice.

 

 

Aborted Module Name:   KFSXAPEI.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

05/17/11     Tue        06:03          Restarted by ITS.

05/17/11     Tue        14:16          Restarted by ITS.

Error log and follow up comments:

 

06:03.

2011-05-17 06:09:15,881 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800) ; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800)

RunBatch ERROR: Exception found:

org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800) ; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800)

Caused by: java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800)

at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

 

Can you please re-run this job?  There is a single xml file that is bad and causing the abort.  I have renamed it to 606788404omax1_30818MAY1611_1305630608197614015.xml_badfile so that I can take a look at the problem.  If you re-run the job, all the other xml files should get processed.

 

I will make sure the corrected *.xml file gets placed into /vendorfiles/einvoice with the corresponding .done file.

John.

 

14:16.

* SQLException during execution of sql-statement:

* sql statement was 'INSERT INTO KRNS_NTE_T (NTE_ID,OBJ_ID,VER_NBR,RMT_OBJ_ID,AUTH_PRNCPL_ID,POST_TS,NTE_TYP_CD,TXT,PRG_CD,TPC_TXT) VALUES (?,?,?,?,?,?,?,?,?,?) '

* Exception message is [ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800)

* Vendor error code [12899]

* SQL state code [72000]

* Target class is 'org.kuali.rice.kns.bo.Note'

* PK of the target object is [noteIdentifier=727543]

* Source object: note(noteIdentifier)=(727543)

* The root stack trace is -->

* java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."KRNS_NTE_T"."TXT" (actual: 1082, maximum: 800)

               at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

 

I found 4 potential files causing the error. I have renamed these so that they do not get processed.

Please re-run this job. If I have isolated the problem file, then it should finish.

4 potential bad files:

159148746_9538764508_1305626449968863369.xml

159148746_9538764516_1305626452665289181.xml

159148746_9538764524_1305626452300099449.xml

606788404omax1_25644MAY1611_1305630765656499603.xml

John.

 

 

Aborted Module Name:   FAIDPACK_OD.LYNX_01

  Date:        Day:      Time:          Resolution:

05/26/11     Thu        04:15           See follow up from Janice below.

Error log and follow up comments:

 

STATUS=HTTP/1.1 200 OK

   URL=http://wsprod.colostate.edu/cwis231/autorun/parent_tknt_email.cfm?ay=FAIDTKNT_EV (GET)

STATUS=HTTP/1.1 503 Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

+ orig_log_run=Y

+ export orig_log_run

+ log_run=Y

 

I forwarded this to Tom Biedscheid's group and he requested that we restart the failed component, which I've already done.  According to Tom, a similar problem occurred a couple weeks ago and restarting was the solution that time :) FAIDTKNT_EV HAS COMPLETED.

Janice.

 

 

 

Aborted Module Name:   OSYSJOBS_11.OSYSLLNK_01

 

  Date:        Day:      Time:          Resolution:

05/22/11     Tue        16:32          Restarted by ITS.

 

Error log and follow up comments:

 

 

<</ais02/job/prod/kshexe_rsh.70>> sys_llnk_rsh.ksh

<#/ais02/job/prod/sys_llnk_rsh.ksh.23#> alias log=echo "*** " $(date +%m/%d/%Y-%T)

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> find / -name tmp -prune -o -name proc -prune -o -type l -ls

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_09.OSYSPURG_01.6302011.6302014.00_opmn_logs is not valid.

<#/ais02/job/prod/sys_llnk_rsh.ksh.24#> errtrap_rsh /ais02/job/prod/sys_llnk_rsh.ksh 1

Remote Shell errtrap_rsh parm 2 value is 1

<#errtrap_rsh.21#> [[ 1 > 0 ]]

<#errtrap_rsh.21#> exit 1

<</ais02/job/prod/kshexe_rsh.70>> errtrap_rsh kshexe_rsh 1

Remote Shell errtrap_rsh parm 2 value is 1

<<errtrap_rsh.3>> [[ 1 > 0 ]]

<<errtrap_rsh.6>> print *** \n*** ERROR: Guffey SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Guffey SCRIPT ABORTED - EXIT CODE=1

***

<<errtrap_rsh.7>> exit 1

+ grep SCRIPT ABORTED

+ /ais02/log/OSYSJOBS_11.OSYSLLNK_01.6302027.6302032.00.2011_05_22_1632.

 

I think this might be one of those failures where you just restart the job.

I think Janice gave information on this at one time.

I pasted the information in this email for you.

Rich.

 

I'm forwarding this email which I sent early in November regarding a couple OSYSLLNK failures.  Please review this email for instructions on troubleshooting OSYSLLNK failures and as was indicated near the end of the email:

Sometimes we see these "status is not valid" messages for the find command - generally caused by a file being deleted after the find command "found" the filename. Usually, we can just restart jobs in this situation,....

Based on the previously communicated information, IT Scheduling could have just restarted the two failed OSYSLLNK components this morning because they both failed with similar 'The status on "filename here" is not valid' errors.
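
The race described above (an entry disappears between being listed and being examined) can be tolerated in code instead of treated as a failure. The Python sketch below is not a replacement for sys_llnk_rsh.ksh; it only illustrates the idea. The function name is hypothetical, and skipping a vanished entry silently is an assumption about what would be acceptable here.

```python
import os
import stat
from typing import List

# Names pruned by the find command in the log above (-name tmp -prune -o -name proc -prune).
PRUNED = {"tmp", "proc"}


def list_symlinks(root: str) -> List[str]:
    """Return the symbolic links under root, skipping entries that vanish mid-scan."""
    links = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into pruned directories.
        dirnames[:] = [d for d in dirnames if d not in PRUNED]
        for name in dirnames + filenames:
            if name in PRUNED:
                continue
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except FileNotFoundError:
                # The entry was deleted after it was listed: the situation that
                # produces "status ... is not valid" in the find output.
                continue
            if stat.S_ISLNK(mode):
                links.append(path)
    return sorted(links)
```

Restarting the job achieves the same result operationally, since the deleted file is simply no longer there to be found on the second pass.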

Janice.

 

 

 

Aborted Module Name:   HRMSWKSP_01.AROSS142_01

  Date:        Day:      Time:          Resolution:

05/23/11     Mon       22:16          Restarted by ITS.

 

Error log and follow up comments:

 

22:16:56 528  --Document the end of the program.

22:16:56 529  DBMS_OUTPUT.PUT_LINE

22:16:56 530          ('**** End   of AROSS142 ' ||to_char(sysdate,'MM/DD/YYYY HH24:MI:SS'));

22:16:56 531  end;

22:16:56 532  /

old   5: vuser           twraccd.twraccd_user%type := '&&p_user';

new   5: vuser           twraccd.twraccd_user%type := 'FRANKMTZ';

old   6: vstart_date     date := '&&p_start_dateDD_MON_YYYY';

new   6: vstart_date     date := '06-may-2011';

old   7: vend_date       date := '&&p_end_dateDD_MON_YYYY';

new   7: vend_date       date := '20-may-2011';

old   9: WS_OBJECT_CODE      varchar2(40) := '&&P_WS_OBJECT_CODE';

new   9: WS_OBJECT_CODE      varchar2(40) := '4401';

**** Start of AROSS142 05/23/2011 22:16:57

Batch Number: WSFRANKMTZ2011050019

Account Does Not Exist: 827855071 6464040 4401

declare

*

ERROR at line 1:

ORA-20100: Account Does Not Exist in TBRACCT, Account B

ORA-06512: at line 266

ORA-06512: at line 499

 

This process pulls data from HR and creates a TWARBUS batch to create invoices for the employers (Work Study).  This record is the problem:  827855071 6464040 4401.  The account object code does not exist in TBRACCT, specifically tbracct_account b.  This means a detail code needs to be set up if the users want to apply transactions to this account through TWARBUS.

Generally to fix this we would contact Frank Martinez.  Give him the record and ask if we can skip it to continue the schedule.

If he agrees, we should put a temp version of AROSS142 out on kebler and comment out the raise_application_error component of the error message below so that we log the problem record but don't abort the program.

Then we can move on with the schedule.
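
The effect of commenting out raise_application_error is to turn "abort on the first bad record" into "log the bad record and keep going". AROSS142 itself is PL/SQL; the Python sketch below only illustrates the two behaviours side by side. The function name, the account_exists callback, and the abort_on_missing switch are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def process_records(
    records: Iterable[str],
    account_exists: Callable[[str], bool],
    abort_on_missing: bool = True,
) -> Tuple[List[str], List[str]]:
    """Return (processed, skipped) records.

    With abort_on_missing=True the first missing account stops the run,
    like the original program. With False the record is logged in the
    skipped list and processing continues, like the temporary version.
    """
    processed: List[str] = []
    skipped: List[str] = []
    for record in records:
        if not account_exists(record):
            if abort_on_missing:
                raise LookupError(f"Account Does Not Exist: {record}")
            skipped.append(record)
            continue
        processed.append(record)
    return processed, skipped
```

The skipped list is what would be handed back to the business contact for follow-up once the schedule has finished.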

Josh.

 

I put a temporary version of AROSS142 in /ais01/src/sql/temp on Kebler as Josh suggested.  Please continue the aborted job.

Rob.

 

 

 

 

 

Aborted Module Name:   HRMSS230.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

05/31/11     Tue          00:30          Deleted by ITS.

 

Error log and follow up comments:

 

 

# > Couldn't stat remote file: Permission denied

# > Couldn't stat remote file: Permission denied

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

 

It appears that the file was uploaded to the vendor successfully, but could someone on the HR team contact the vendor to confirm this? 

The job failure was due to the "Permission denied" messages when trying to "list" (-ls -l) the file on the vendor's site:

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

Previous executions of this chain indicate that we have not received the "Permission denied" message when attempting to "list" the file before and/or after the upload.

**Has something changed on the vendor's side which is preventing us from performing this command - and causing this error message?

Once the confirmation has been received that file upload was successful to vendor, then this failed component may be deleted to allow the chain to complete.  If the file was not transmitted successfully, then this failed component should be restarted - however, if the "permission denied" error message is produced when trying to list the file, the component will fail again.

Janice.

 

Bob,

Do you have a contact at VSP that can verify they received this file?

Steve.

 

I will contact them now and I'll keep you all posted.

I sent out two separate emails this morning and have not yet received a response.  I'm very confident that they got the file and that the ABORT was due to a permissions change on their directory.

I'll let everyone know as soon as I get a response from VSP.

-Bob-

 

Earlier this afternoon, Bob V. and I discussed this issue and decided that if VSP did not respond before Bob leaves for the day (at 3:30 P.M.), then we would assume that the vendor received the file and delete the aborted HRMSS230.SSH_SFTP_01 component - thereby allowing remainder of the chain to complete.

Please proceed with deleting HRMSS230.SSH_SFTP_01 from backlog as soon as possible.

The issue that occurred 2 weeks ago when HRMSS230 failed happened again last Friday night.  The SSH_SFTP step will continue to fail with the permissions problem until this issue is resolved with the vendor.  I suggest that contact be made once again with the vendor to 1) confirm whether they received the file from Friday night, and 2) resolve what has changed on their end that causes the permissions problem when the SSH_SFTP process attempts to issue an "ls" on the file (on their server):

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

Janice.

 

1)  The vendor, VSP, just informed me that they did receive the file on Friday.

2)  I've already sent an email to Elden to ask him to contact the EDI team at VSP to get this resolved.

-Bob-

 

 

 

 

Aborted Module Name:  AREGDPTR_UG.AREGS607_01

 

  Date:        Day:      Time:          Resolution:

05/28/11     Tue         00:08          See follow up below.

 

Error log and follow up comments:

 

 

ORA-20100: ::An exact data code already exists for this person.::

ORA-06512: at "BANINST1.CSUG_API_GP_DTCD_RULES", line 2622

ORA-06512: at "BANINST1.CSUG_API_GP_DTCD", line 1256

ORA-06512: at line 418

 

AREGDPTR_UG.AREGS607_01 is complete.

David.

 

Looks like the AREGDPTR_UG  & GR  finished.  YEA!!!!

 

Wondered if I could go to mainsite and pick up the labels this afternoon instead of waiting until tomorrow’s delivery?  Thank you for getting this resolved.  :)

Denise.

 

The diploma transcript process failed Friday night.  The problem seems to be a duplicate 8021/DIPLTRNPRT data code for student 824511472 (pidm is 10630196).  Could you please delete one of these rows and then we can restart the job?

Rob.

 

 

 

Aborted Module Name:  HRMSKFS_QPH.HRMSS175_01

 

  Date:        Day:      Time:          Resolution:

06/01/11     Wed       13:59          See follow up below.

 

 

Error log and follow up comments:

 

 

# open     : Open Host (ftp.dbman.com)

#******************************************************************************

# FATAL : Cannot connect to (ftp.dbman.com): Net::FTP: connect: A system call received a parameter that is not valid.

#------------------------------------------------------------------------------

# USAGE: /appworx/csu/exec/FTP_ENHANCED.PL \

#     remote_host=override_host_name\

#     transfer_mode=transfer_command\

#     translate=translation_mode\

#     src_file=fully_qualified_source_file\

#     dst_file=fully_qualified_destination_file\

#     site_options=comma_delimited_site_options\

#     local_options=semicolon_delimited_local_options\

#   transfer_mode values

#     append, dir, get, put, recv, send, submit

#   translate values:

#     ascii, binary, ebcdic

#   site_options values

#     comma delimited site options

#       RECFM=FBA,LRECL=133,BLKSIZE=3325

#   local_options values

#     semicolon delimited local options

#       active | passive | cd=remote_dir

#   Also, these environment variable must be set

#     net_connect, db_login, db_password

#------------------------------------------------------------------------------

 

I think you've got the aborted module name wrong in this email - shouldn’t it be HRMSDIRC_VT.FTP_AIS01_DIR_01? Elden isn't here today, but maybe the HR Team can check with Juliana Hissrich to see if anything has changed with the vendor and/or if their server is available?

 

If the HR Team hasn't already contacted Juliana, then I think we need to, at a minimum, send an email to hrsao_printed_directory@mail.colostate.edu to let them know that the ftp of the printed directory test file to vendor failed.

If HR Team has already contacted hrsao_printed_directory@mail.colostate.edu about this problem, then we'll just wait to hear back from Juliana and/or vendor.

Janice.

 

Does the log give a reason why the ftp failed?

Steve.

 

The earlier email traffic includes pretty much all we see in the joblog  -  don't think we're even getting connected.  If you go back to the original email that Robin sent, it would have the Appman joblog attached.

Janice.

 

 

 

Aborted Module Name:   HRMSS230.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

06/01/11     Wed       22:37          Deleted by ITS.

 

Error log and follow up comments:

 

# > Remote working directory: /g0021702

# > sftp> lpwd

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l /ais01/ftp/to/user/HRMSS230.VSP.DAT

# > -rw-rw----    1 appworx  Gftp         682340 Jun 01 22:31 /ais01/ftp/to/user/HRMSS230.VSP.DAT

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

# > Can't ls: "/prod/g0021702" not found

# > sftp> put /ais01/ftp/to/user/HRMSS230.VSP.DAT /prod/g0021702

# > Uploading /ais01/ftp/to/user/HRMSS230.VSP.DAT to /prod/g0021702

# > sftp> -ls -l /prod/g0021702

# > Couldn't stat remote file: Permission denied

# > Can't ls: "/prod/g0021702" not found

# > sftp> exit

# > (0)

# > (0)

 

This is the same problem that we've been having for the previous two times that this job ran - need to contact vendor to find out why permissions have changed on their end so that our process can send the file, but cannot perform an "ls" to verify existence of file.

Janice.

 

Please delete.  Yes, this is the same old problem (permissions) that will hopefully get resolved soon.

-Bob-

 

I've altered the logic in our global COMPLETION script, /appworx/exec/COMPLETION, to allow for an exceptions file when scanning *FTP* output listings.  We generally consider "Permission denied" a fatal error, but I've placed the following into the newly created "allowed exceptions" file, /ais01/dat/misc/prod/errstrg_ftp_exceptions:

Couldn't stat remote file: Permission denied

If anyone thinks the above "exception" message will permit an unacceptable error to slip through undetected, then we'll need to discuss.  Otherwise, this should solve the problem we've been having with HRMSS230.SSH_SFTP_01, which has been successfully transmitting the file, but failing when the COMPLETION script detected "Permission denied" within the output.  HRMSS230_VSP is next scheduled to run this Friday (6/10).
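
The exceptions-file idea can be pictured as a two-step filter: drop any output line that matches an allowed exception, then look for fatal strings in what remains. The Python sketch below is an illustration only and is not the logic in /appworx/exec/COMPLETION, which is not shown here. The function name, the plain substring matching, and the line-at-a-time scan are all assumptions.

```python
from typing import Iterable, List


def fatal_lines(
    output_lines: Iterable[str],
    fatal_strings: Iterable[str],
    allowed_exceptions: Iterable[str],
) -> List[str]:
    """Return the output lines that contain a fatal string and no allowed exception."""
    fatal = list(fatal_strings)
    allowed = list(allowed_exceptions)
    hits = []
    for line in output_lines:
        # A line matching an allowed exception is ignored entirely.
        if any(exc in line for exc in allowed):
            continue
        if any(err in line for err in fatal):
            hits.append(line)
    return hits
```

One consequence of matching whole lines this way: a genuine fatal message that happened to share a line with the excepted text would be skipped as well, which is the kind of "slip through" case raised above.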

Just to test out the new logic in the COMPLETION script, I restarted the aborted HRMSDIRC_VT.FTP_AIS01_DIR_01 component, which of course failed.  I wasn't expecting this restart to ftp successfully - it was just a convenient way to be sure there were no syntax errors in the *FTP* specific logic within the COMPLETION script.

Janice.

 

 

 

 

Aborted Module Name:   FAIDDLDR_EV.GLBDATA_04

  Date:        Day:      Time:          Resolution:

06/04/11     Sat         23:11          Restarted by David.

 

Error log and follow up comments:

 

SUNGARD HIGHER EDUCATION                                                     

                                                     POPULATION SELECTION EXTRACT                                                   

                                                          CONTROL REPORT                                               PAGE       1

                                                                                                                         

              Start Time: 04-JUN-2011 00:23:13                                                                                     

         GLBDATA Version: 8.3.0.5                                                                                                   

          Selection ID 1: FAIDDLDR_EV_EXIT_SAT                                                                                     

             Application: FINAID                                                                                                    

              Creator ID: FAUSER                                                                                                   

                                                                                                                                    

*ERROR* Dynamic parm FAID_EV_AID_YEAR not found or is null                                                                         

  SQLCODE = 1403                                                                                                                    

SQL ERROR = ORA-01403: no data found

                                                                                              

X01 ROLLBACK SQLCODE=0000                                                                                                          

X01 COMMIT (1) SQLCODE=0000                                                                                                         

SQLCODE = 0000                                                                                                                     

ORA-01403: no data found                                                                                                            

DQY-ABORT ROLLBACK SQLCODE = 0000   

 

FAIDDLDR_EV is complete

David.

 

The downside to this is that it was so late on Sunday that the FAID schedule was still running when the Sunday night Oracle recycle of Banner occurred.

This happened partway through the FAIDDISB_FA RPEDISB program, causing it to fail.

Tom, is there any problem with just doing a restart on this failed component, or do we need to worry about “fixing” any data before we restart because of the failure partway through processing?

Janice.

 

 

No problem restarting when it's a GLBDATA run.

Tom.

 

Uh.. that confused me – it failed in the RPEDISB program, not GLBDATA?

Janice.

 

Got it. I was lost in the subject line. There's no problem restarting RPEDISB, either.

Tom.

 

Oh.. sorry about that, it was “connected” to the earlier fix of FAIDDLDR_EV.GLBDATA_04 because that was restarted so late on Sunday that FAIDDISB ran into the Sunday night database recycle.  I’ve restarted FAIDDISB_FA.RPEDISB_01 – hopefully we can finish the FAID schedule soon!

Janice.

 

 

Aborted Module Name:   FAIDSNTD.WAIT_FOR_CHAINS_01

 

  Date:        Day:      Time:          Resolution:

08/23/11     Tue        08:20           See follow up from Janice below.

10/26/13     Sat         06:10           Restarted by Joleen.

 

Error log and follow up comments:

 

 

1> /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.chain_prefix.dat

+ [[ -s

+ /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.chain_prefix.dat

+ ]] print *** ERROR:  NO CHAIN MODULES FOUND FOR CHAIN

*** ERROR:  NO CHAIN MODULES FOUND FOR CHAIN

+ cat /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.dat

      6800769.01 FAID      TDCLIENT_SEND       08/23 08:22 ABORTED     AWPROD    APPWORX

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

 

Notice that this indicates that a TDCLIENT_SEND spawned module failed.  If you look at that failure, you'll see that it redirects output to the /ais01/dat/work/prod/TDCLIENT_SEND.CRPG11IN.TXT file, which contains the following error message:

WARNING: Failed to connect to server                                           

Error connecting to network SAIGPORTAL                                         

(234) FTP connection attempt failed.                                           

SSL Handshake failed:                                                          

I suggest restarting the failed TDCLIENT_SEND component to see if it can connect to SAIG, then restart the failed FAIDSNTD.WAIT_FOR_CHAINS_01.  The FAIDSNTD is waiting for all the spawned TDCLIENT_SEND modules from its predecessor chains to have completed.  The output reports from all these spawned TDCLIENT_SEND modules are then captured to the FAIDSNTD VistaPlus report.
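
When deciding which spawned modules to restart, the jq.dat listing shown above can be read mechanically. The Python sketch below pulls out the job number and module name of every ABORTED row. It is a hedged illustration: the function name is hypothetical, and the column meanings (job number first, module name third) are inferred from the rows in this log rather than from any documented file layout.

```python
from typing import List, Tuple


def aborted_modules(jq_text: str) -> List[Tuple[str, str]]:
    """Return (job_number, module_name) for each row whose status is ABORTED."""
    found = []
    for line in jq_text.splitlines():
        fields = line.split()
        # Trace lines such as "+ exit 1" never contain the ABORTED field.
        if len(fields) >= 3 and "ABORTED" in fields:
            found.append((fields[0], fields[2]))
    return found
```

Looking for the status as a separate field, rather than at a fixed column, keeps the check working on rows where a long module name runs into the date column.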

Let me know if you have questions.

Janice.

 

 

10/26/2013 10:24    JWEARNE

FAIDSNTD.WAIT_FOR_CHAINS_01 11752556 10-26-2013 06:10:43

MDT    202 ABORTED

 

Only FAIDAM99 was left to run so I restarted FAIDSNTD and it has finished running.

 

*** ERROR:  NO CHAIN MODULES FOUND FOR CHAIN

+ cat /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.dat

     11753072.00 FAID      FAIDPLOD_PELL_ORIG_D10/26 06:09 FINISHED    AWPROD

+ exit 1

+ err=1

 

The run flags for FAIDPLOD are N for both schedules.

Here is the wait for chains file:

BROWSE --- /ais01/dat/apmx/prod/TDCLIENT_SEND_SPAWNED_FROM_PROCESS_FLOWS

********************************* TOP OF DATA

FAIDCORR

FAIDDLPL

FAIDDLST

FAIDPLOD

FAIDTMEX

******************************** BOTTOM OF DATA

 

All FAID jobs were done running. I restarted FAIDSNTD and it completed.

Joleen.

 

 

 

Aborted Module Name:   HRMSDEM_SAL.HRMSR060_01

 

  Date:        Day:      Time:          Resolution:

08/25/11     Thu        12:55           see follow up below.

 

Error log and follow up comments:

 

Enter Password:

REP-0069: Internal error

REP-57054: In-process job terminated:Finished successfully but output is voided

 

Report Builder: Release 10.1.2.2.0 - Production on Wed Aug 24 16:57:27 2011

 

Delete it.  If we need to we can run it later through the application since there are no Vista plus implications.

Steve H.

 

I deleted the HRMSDEM_SAL.NOTIFY_FOR_APWX_01 component from backlog, deleted the #AW99_6808259 subvar, then deleted the aborted HRMSDEM_SAL.HRMSR060_01 component, thereby allowing the CHAIN_FINISH component to run.  HRMSDEM_DEMAND_DEPOSIT_ADVICES has completed.

By the way, the reason I deleted the #AW99_6808259 chain-specific subvar (which had the value of HRMSSAL4) was so that the HRMSDEM_SAL.CHAIN_FINISH_01 component would not write an HRMSSAL4_CHAIN_FINISH_HRMSDEM_SAL... entry in /ais01/dat/apwx/prod.  We've already allowed HRMSSAL4.CHAIN_SUMMARY to proceed and pick up all the related HRMSSAL4_CHAIN_FINISH_* entries for the Salary Phase 4 feedback email -- don't want to create one now for this chain, which could potentially be picked up in the feedback summary the next time Salary Phase 4 runs.

Janice

 

 

 

Aborted Module Name:  DOITHRS1.FTPS_CURL_01

  Date:        Day:      Time:          Resolution:

07/07/11     Thu        17:01           Restarted by Janice.

 

Error log and follow up comments:

 

 

DOITHRS1.FTPS_CURL_01 ABORTED last night with the error shown below.  There must have been a connectivity problem at GGCC at that time - I restarted this component and it finished successfully this morning.

 

# > * SSL read: error:00000000:lib(0):func(0):reason(0), errno 73

# > * FTP response reading failed

# > * Closing connection #0

# >

# > curl: (56) SSL read: error:00000000:lib(0):func(0):reason(0), errno 73

# > (56)

#==============================================================================

# FATAL : Command failed with code : 56

Janice.

 

 

 

Aborted Module Name:   HRMSCPR_HRL.HRMS_SPAWN_LOG_01

 

  Date:        Day:      Time:          Resolution:

07/08/11     Fri         16:14            See follow up below.

 

Error log and follow up comments:

 

+ 1>> /ais01/dat/work/prod/HRMSCPR_HRL.HRMS_SPAWN_LOG_01.Spawned_Log

+ read this_spawned_req

+ grep C

+ cut -f2 -d ?

+ print 6789789?E

+ print *** \n*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION \n***

***

*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION

***

+ exit 100

 

This is similar to the situation which occurred in the June Salary Phase 2.  HRMSCPR_HRL.HRMS_SPAWN_LOG_01 aborted because it found a concurrent request, spawned by HRMSS064, which is in Error status  - i.e. a spawned request which did **not** successfully complete. 

In this case, HRL_15-JUL-2011 (Hourly Employee Pre-gen Distribution Lines) request id 6789787 spawned PSP: Import Pre-Generated Distribution Lines (REQUEST ID 6789789), and the Status of the spawned 6789789 is "Error" - see error from 6789789 below.

The program failed with the following error(s) :

This batch has errors. To see the error messages please use Pre-gen distribution lines form.

Janice.

 

I fixed the first fail and pre-gen has finished but it did not go to the balance report?

Vickie.

 

The first failure caused the HRMSS064 (Hourly Employee Pre-gen Distribution Lines) program to fail.  After you fixed that, HRMSS064 was restarted and did finish successfully.  However, there's more to the equation than HRMSS064 completing successfully, as this program submits other concurrent requests to perform some of the "work"... and it was one of these spawned concurrent requests that failed as noted in the earlier email.  Viewing request id 6789789 within HR may provide more information to you regarding the reason for the failure and how to proceed.  The Phase 2 process will not move forward with the next step (Payroll Balance Report) until the aforementioned problem has been resolved.

Janice.
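
The trace above shows each spawned request arriving as `request_id?status` (for example `6789789?E`), with the status field then tested for `C`. A minimal Python sketch of that check follows. It assumes that `C` means a successful completion and uses an exact comparison rather than the `grep` in the trace; the function name `failed_requests` and the `6789787?C` entry are hypothetical, for illustration only.

```python
def failed_requests(entries, ok_status="C"):
    """Return the request ids whose status code is not the assumed success code."""
    failed = []
    for entry in entries:
        request_id, _, status = entry.partition("?")
        if status.strip() != ok_status:
            failed.append(request_id)
    return failed

# "6789789?E" is taken from the trace above; "6789787?C" is a made-up passing entry.
print(failed_requests(["6789787?C", "6789789?E"]))  # ['6789789']
```

A non-empty result corresponds to the SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION abort, and the listed request id is the one to review within HR.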

 

 

 

Aborted Module Name: HRMSSQWL.SQWL-LOOP_01

 

  Date:        Day:      Time:          Resolution:

07/12/11     Tue         09:02            See follow up below.

 

Error log and follow up comments:

 

 

Value exceeded allowable range (line 273 of COSQWL_SUPPLEMENTAL)

 

Cause:        Caused by Oracle error 6502 occurring during the execution of the

formula which is raised when an arithmetic conversion error, or string tr

 

+ awrun SEND_MAIL -v HRMSSQWL_SEND_MAIL_CO -arg

+ jobprd@mailer.is.colostate.edu

+ /ais01/dat/misc/mailst/SEND_MAIL.HRMS.APMX.ALERT.LST

+ /ais01/dat/misc/mailst/SEND_MAIL.HRSAO_SQWLARCH_FOLLOWUP.LST

+ /ais01/dat/misc/mailst/SEND_MAIL.HRMS.APMX.ALERT.LST _NULL_

+ PYUSSQWLGRE(ID=6792213) SPAWNED SQWLARCH PROBLEM - Colorado _NULL_

+ /ais01/dat/misc/mailst/SEND_MAIL_TEMPLATE.SQWL_FAILURE.TXT

+ SQWLARCH_(SPAWNED_FROM_PYUSSQLWLGRE(ID=6792213) Colorado 6792214

+ /oraapps/hrprod/log/l6792214.req

JOBID: 6577943

+ exit 1

+ err=1

 

The SQWL process sends a follow-up email directly to the users (HRSAO SQWLArch Followup), as well as to the Alert HRMS WHRS and Alert APMX lists.  Please refer to the earlier mail, with subject PYUSSQWLGRE(ID=6792213) SPAWNED SQWLARCH PROBLEM - Colorado, dated Tue 7/12/2011 9:58 AM.  Because we have this automated feedback reporting for the SQWL failures, there is no need to also send the normal HRMS ABORT followup email.

Janice.

 

 

 

 

 

Aborted Module Name:   AREGDYTR.CONVERT_PDFTOPS_01

  Date:        Day:      Time:          Resolution:

07/13/11     Wed        07:00           See follow up below.

 

Error log and follow up comments:

 

There is no output file. The BEFORE condition checks for the file below and aborts the task whenever that file does not exist.

 

{#spool_out}/AREGDYTR.AREGR600.{chain_id}.PDF

 

After talking to Vicki, Steve and I followed the instructions Janice outlined below from the abort book (see page 571 or note from Janice below).

We deleted the failed component and the subsequent Spool Filter, which allowed the rest of the chain to run. Unfortunately, that meant that the transcripts that were expected were not generated. In order to print those, we changed the subvar #AREGORTR_REPRINT_DATE to today’s date and ran AREGORTR_ONREQ_TRANSCRIPT. There is a note in the chain about it:

COMMENTS:     

NOTE: This chain must have a correct date updated - #AREGORTR_REPRINT_DATE

 

Due to the error in the AREGR600, there was no output PDF file from AREGR600 to be used as input to the CONVERT_PDFTOPS component.  I deleted this failed component, as well as the subsequent SPOOL_FILTER component which would have spooled the “postscript” output from the CONVERT_PDFTOPS component.

 

*** Oracle Report:  AREGR600

    Processing Failed -- Report was not successfully generated.

*** From Appworx Chain: AREGDYTR_DAILY_TRANSCRIPT

*** Oracle Instance: banprod

*** Report Parameters Used:

    req_levl=AL

    p_source=B

    PDFEMBED=YES

*** Report Errors:

    REP-0177: Error while running in remote server

    Unable to retrieve a string from the Report Builder message file.

    REP--002:

Janice.

 

 

 

Aborted Module Name:   AROSDBIO.AROSS141_01

  Date:        Day:      Time:          Resolution:

07/21/11     Thu        18:27          Restarted by Joleen.

04/03/12     Tue        18:06           Restarted by Joleen.

 

Error log and follow up comments:

 

 

07/21/11.

ERROR at line 4:

ORA-01847: day of month must be between 1 and last day of month

ORA-06512: at line 341

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

To fix this abort, Rob figured out which record was causing the problem. It turns out that the birthdate requires dd/mm/yyyy, and the record in question had a 5 instead of an 05. The records in this module are created by students filling out a form online. AR doesn't have the option to modify the records, so they had to delete the problem record and manually re-enter it for the student. I restarted the module and it finished.

Joleen.
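
A record like this could be caught before the load with a strict format check. The following Python sketch assumes the dd/mm/yyyy format described above; the function name `valid_birthdate` and the sample dates are hypothetical. The explicit pattern is needed because `strptime` on its own accepts a day written without the leading zero.

```python
import re
from datetime import datetime

def valid_birthdate(text, fmt="%d/%m/%Y"):
    """True only for a zero-padded, real calendar date such as 05/07/1990."""
    # Require two-digit day and month; strptime alone would accept "5/07/1990".
    if not re.fullmatch(r"\d{2}/\d{2}/\d{4}", text):
        return False
    try:
        datetime.strptime(text, fmt)
    except ValueError:
        return False
    return True

print(valid_birthdate("05/07/1990"))  # True
print(valid_birthdate("5/07/1990"))   # False
```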

 

04/03/12.

ERROR at line 1:                                                             

ORA-20100: AROSS141 Failure: -20100 ORA-20100: ::E-mail addresses must have a

least 1 character in front of "@" and at least 1 character                   

ORA-06512: at line 104

 

Here’s what /orautl/BANPROD/AROSDBIO.AROSS141_01.utl_file1 says:

Insert failed: 829854495 Phipps -20100 ORA-20100: ::E-mail addresses must have a least 1 character in front of "@" and at least 1 character

Steve G.

 

Thanks for finding the problem record.  Janet was able to edit the record and correct the email address.  Can you start our schedule up again?

Steven.
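
The rule quoted in the ORA-20100 message can be approximated ahead of the load. This Python sketch assumes the truncated message ends with "after the @", which the log does not actually show; the function name `email_shape_ok` and the sample addresses are hypothetical.

```python
def email_shape_ok(address):
    """Require at least one character on each side of a single '@'."""
    local, sep, domain = address.partition("@")
    return bool(sep) and bool(local) and bool(domain) and "@" not in domain

print(email_shape_ok("student@example.edu"))  # True
print(email_shape_ok("@example.edu"))         # False
```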

 

 

 

 

Aborted Module Name:   HRMSKFS_SAL.HRMSS175_01

 

  Date:        Day:      Time:          Resolution:

07/22/11     Fri          15:13          See follow up below.

04/29/14     Tue        14:24          Restarted by Robin.

 

Error log and follow up comments:

 

07/22/11.

15:13:54 1630  If (ctl_sum_salary - l_net - ctl_sum_ee_deductions - ctl_sum_cash_deductions) <> 0 Then

15:13:54 1631            DBMS_OUTPUT.PUT_LINE('###################################################');

15:13:54 1632            DBMS_OUTPUT.PUT_LINE('#####                                         #####');

15:13:54 1633            DBMS_OUTPUT.PUT_LINE('####                                           ####');

15:13:54 1634            DBMS_OUTPUT.PUT_LINE('###        KFS FILE IS OUT OF BALANCE ###');

15:13:54 1635            DBMS_OUTPUT.PUT_LINE('####                                           ####');

15:13:54 1636            DBMS_OUTPUT.PUT_LINE('#####                                         #####');

15:13:54 1637            DBMS_OUTPUT.PUT_LINE('###################################################');

15:13:54 1638  --    RAISE kfs_not_balanced;

15:13:54 1639  End if;

 

The out of balance condition is going to take some time to research.  The plan is to complete salary phase 4 processing on Monday. I am not sure what implications this has for encumbrances.  Does this mean we need to suspend encumbrances until after phase 4 has completed?

Steve.
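
For anyone researching the out of balance condition, the test on line 1630 above is a single subtraction. The Python sketch below reproduces that arithmetic with exact decimal values; the function name `balance_difference` and the sample amounts are hypothetical and are not taken from the payroll data.

```python
from decimal import Decimal

def balance_difference(sum_salary, net, sum_ee_deductions, sum_cash_deductions):
    """Return the difference tested on line 1630; zero means the file balances."""
    return (Decimal(sum_salary) - Decimal(net)
            - Decimal(sum_ee_deductions) - Decimal(sum_cash_deductions))

# Made-up amounts: 1000.00 - 700.00 - 250.00 - 50.00 balances exactly.
print(balance_difference("1000.00", "700.00", "250.00", "50.00") == 0)  # True
```

Passing the amounts as strings keeps the comparison exact, which matters when the difference is only a cent.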

 

I just cleared up the stall in tonight's HRMS schedule - excerpt from News file follows:

 

I saw Steve Hill's email earlier this evening that the Salary Phase 4 would have to wait until Monday. 

In light of that, we don't want to run encumbrances tonight anyway. 

I'm not sure what happened to the safeguard that we are supposed to have in place to make sure that HRMSAW15 runs before staff leave for the day.  We had this situation (last month, I think)... HRMSAW15 was waiting for a notify file from Salary Phase4, which of course isn't going to be created tonight!  HRMSENCD was waiting for HRMSAW15 - and we would have had a stalled HRMS schedule due to this failure to make sure that HRMSAW15 "stall" was handled during working hours. 

I deleted the waiting HRMSAW15.WAIT_FOR_APWX_SAL4 component to clear out the stall!  HRMSENCD chain has been automatically cancelled (due to Salary running, but Phase 4 not done) - so think we are on track now for HRMS schedule to proceed for tonight.

Janice.

 

04/29/14.

*** Follow-up Required -- Review output included below:

<<include=>>

 

The HR test process flow run in AppMan AWTEST did not have a  *** AWTEST ***  prefix in the email subject. The mail list needs to be checked too. We will add this to our SQL updates list.

How to use SQL update statement to avoid above :)

Gudrun.

 

 

 

 

Aborted Module Name:   KFSXFPDV.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

08/05/11     Fri          00:02          Restarted by Joleen.

 

 

Error log and follow up comments:

 

ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-43-732.pdf

/ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-51-63.pdf

+ + ls /ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-43-732.pdf /ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-51-63.pdf

this_pdf=/ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-43-732.pdf

/ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-51-63.pdf

+ cp /ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-43-732.pdf /ais02/app/kfs/prd/work/reports/fp/disbursement_voucher_batch_20110726-22-22-51-63.pdf /ais01/spool/vplus/out/KFSXFPDV.KFSX_JAVA_01.6660543.pdf

cp: 0653-437  /ais01/spool/vplus/out/KFSXFPDV.KFSX_JAVA_01.6660543.pdf is not a directory.

+ exit 1

 

I discussed this earlier today with Josh, but just to bring everyone else up to speed….here’s what happened:

The KFSX_JAVA global script expects only one output pdf file to be created from a java program.  Actually, we have very few java programs which still create pdf output files – most were converted  (by the foundation) from .pdf output to .txt output.  The output files are date-time stamped, so we have logic with the global script to “find” the pdf file – thinking that logic will return just **ONE** pdf filename.  The subsequent logic doesn’t work well when more than **ONE** pdf filename was returned – i.e. it caused this error when it tried to execute the cp (copy command) and provided too many parameters for this command. 

 

To avoid this problem in the future, Facilities will need to send only one .xml file to be processed via the nightly KFSXFPDV_FP_DV_BATCHBFS_UPLOAD chain.  If a backlog of .xml file(s) needs to be processed, we could possibly run this chain multiple times during the daytime to catch up.  Please share with Facilities staff this restriction – otherwise, this error will continue to occur and **NOTHING** will get captured to VistaPlus from the loadDisbursementVouchersStep java program.  The KFSXFPDV chain does have the CHAIN_CANCEL feature, so it will not cause a stall within the KFSX schedule.

 

I have manually copied the two pdf output reports from last night’s KFSXFPDV to the /ais01/spool/vplus/out directory (with unique names) and also reconstructed the “chain vplus” output files within the /ais01/spool/vplus/temp/KFSXFPDV_tempdir/ directory.  I also manually created entries in the /ais01/spool/vplus/out/KFSXFPDV_VPLUS_DRIVER for capturing the two pdf reports.  This manual activity was intermingled with running a subset of the KFSXFPDV chain --- with manual creation of files occurring after the CHAIN_INIT component, but before the CHAIN_VPLUS component.  All reports have now been successfully captured to VistaPlus.  BTW- this is a somewhat painful process to manually create these files, so please steer Facilities into the only **ONE** feeder file per night routine!

Janice.
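
One way the "more than ONE pdf" case could be handled is to copy each matching file to its own destination name instead of passing every match to a single cp. The Python sketch below only illustrates that idea and is not the KFSX_JAVA global script: the function name `copy_pdfs` and the numbered naming scheme are assumptions, and the VistaPlus driver entries would still need one line per copied file.

```python
import glob
import os
import shutil

def copy_pdfs(pattern, out_dir, base_name):
    """Copy each matching PDF to its own destination instead of one fixed name."""
    copied = []
    for index, source in enumerate(sorted(glob.glob(pattern)), start=1):
        # Hypothetical naming scheme: <base_name>.01.pdf, <base_name>.02.pdf, ...
        target = os.path.join(out_dir, f"{base_name}.{index:02d}.pdf")
        shutil.copyfile(source, target)
        copied.append(target)
    return copied
```

With the two disbursement_voucher_batch files from this abort, a call such as `copy_pdfs(pattern, out_dir, "KFSXFPDV.KFSX_JAVA_01.6660543")` would produce two uniquely named copies rather than the "is not a directory" error.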

 

This job runs nightly around 10 pm, so we won’t skip any files. The abort was related to putting the job output into our VistaPlus system. I don’t believe that anyone is checking VistaPlus for the summary report of the DVs uploaded (please correct me if I am wrong). So worst case scenario, the DVs load, but VistaPlus does not have the updated job log/summary report. We’ll keep an eye on this to determine if we need an alternate solution.

John.

 

 

 

 

 

 

 

Aborted Module Name:   KFSXAPEI.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

08/11/11     Thu        06:03          See follow up below.

 

Error log and follow up comments:

 

 

               at org.kuali.rice.kns.service.impl.DocumentServiceImpl.validateAndPersistDocument(DocumentServiceImpl.java:679)

at edu.csu.batch.service.impl.BatchRunnerServiceImpl.runJob(BatchRunnerServiceImpl.java:75)

               at edu.csu.batch.service.RunBatch.main(RunBatch.java:67)

2011-08-11 06:10:11,721 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXAPEI.electronicInvoiceExtractStep.6743617.6743620.00 steps: [electronicInvoiceExtractStep]

2011-08-11 06:10:11,721 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.kuali.rice.kns.exception.ValidationException: business rule evaluation failed

RunBatch ERROR: Exception found:

 

Check the KFSXAPEI log, as the Appman log above does not pinpoint where the error is:

cd /ais02/log

ls -al KFSXAPEI*

Select the aborted log (should be the latest one out there).
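
The "latest one out there" step can also be done programmatically. The following is a minimal Python sketch; the function name `newest_log` is hypothetical, and it assumes the aborted run is the most recently modified file with the KFSXAPEI prefix.

```python
import glob
import os

def newest_log(log_dir, prefix):
    """Return the most recently modified file whose name starts with prefix, or None."""
    candidates = glob.glob(os.path.join(log_dir, prefix + "*"))
    return max(candidates, key=os.path.getmtime) if candidates else None
```

For this abort the call would be `newest_log("/ais02/log", "KFSXAPEI")`.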

 

2011-08-11 11:08:42,892 [main] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl ::

 Saving Invoice Reject for DUNS '606788404'

2011-08-11 11:08:42,892 [main] INFO  org.kuali.rice.kns.document.DocumentBase :: invoking rules engine on document

 1453289

2011-08-11 11:08:42,908 [main] INFO  org.kuali.rice.kns.document.DocumentBase :: [document.invoiceRejectItems[0].i

nvoiceItemCatalogNumber] error.format.org.kuali.rice.kns.datadictionary.validation.charlevel.AnyCharacterValidatio

nPattern(Invoice Catalog Number (Catalog Number))

2011-08-11 11:08:51,968 [main] ERROR org.kuali.kfs.sys.batch.Job :: Exception occured executing step

org.kuali.rice.kns.exception.ValidationException: business rule evaluation failed

        at org.kuali.rice.kns.document.DocumentBase.validateBusinessRules(DocumentBase.java:581)

        at org.kuali.rice.kns.service.impl.DocumentServiceImpl.validateAndPersistDocument(DocumentServiceImpl.java

:679)

 

This shows the failed document, 1453289.  KFS technical staff went to that document, discovered a space character, and corrected the file.

 

We restarted the ABORTED KFSXAPEI.KFSX_JAVA_01 step and the chain finished successfully.

Dermot.

 

I don't know what business rule failed but I think I know the eInvoice file that caused the job to stop:

/ais02/app/kfs/prd/work/staging/purap/electronicInvoice/606788404omax1_32681AUG1011_1313042073898565326.xml      

is the offending file that caused the job to abort.  If we move that file out of there we can then rerun the job for the rest of the files while I try to figure out what is wrong with this single eInvoice file.

Matt.

 

 

 

 

 

Aborted Module Name:   AGENAM99.WAIT_FOR_CHAINS_01

  Date:        Day:      Time:          Resolution:

08/13/11     Sat         03:40           See follow up below.

 

Error log and follow up comments:

 

 

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

Could not open server pipe.

 

I found past references of this error in our "Abort log" document that indicated we restarted the component.  I did this, and now AGENAM99.WAIT_FOR_CHAINS_01 is running and waiting for AGENDYGN_DAILY_GENERAL, which is being held up by the abort within HRMSSERP_UPDT_ELEMENT_MEDICARE.

Steve.

 

 

Aborted Module Name:   AROSFRQ1.AROS-PYMTS-LOOP_01

 

  Date:        Day:      Time:          Resolution:

08/13/11     Sat         10:15           See follow up below.

 

Error log and follow up comments:

 

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

Could not open server pipe.

 

I found past references of this error in our "Abort log" document that indicated we restarted the component.  I did this, and now AROSFRQ1.AROS-PYMTS-LOOP_01 is running again and has spawned another AROSFRQ1.AROS_PYMTS_01.

Steve.

 

 

 

 

Aborted Module Name:   AREGDYCR.AREGS304_01

 

  Date:        Day:      Time:          Resolution:

08/16/11     Tue         05:11           See follow up below.

 

Error log and follow up comments:

 

05:11:24 SQL> start AREGS304

SP2-0310: unable to open file "AREGS304.sql"

05:11:24 SQL> 05:11:24 SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options

The sql is not in the directory:

 BROWSE --- /ais01/src/sql/prod/* ------------------------------ INVALID COMMAND

 COMMAND ===>                                                  SCROLL ===> CSR

     NAME                                          SIZE     DATE     TIME   ATTR

    AREGS303.sql                                   21512  11/04/04  10:29     

    AREGS303.sql.prev1                        21334  11/02/02  08:38    

    AREGS303.sql.prev2                        21261  11/01/25  11:31 

    AREGS400.sql                                   13002  07/04/03  13:22

Since this is a new sql, could we please make it follow standards before we place it into production?

 

Just a quick review reveals the following items which need changing:

1)  The obsolete format of the modlog block should not be included:

--* DATE       INIT SSR # REASON FOR THE CHANGE                        *       

--* ---------- ---- ----- -------------------------------------------- *       

--* mm/dd/ccyy--xx--Tnnnnn--Description-last line of modlog-don't delet

2)  Comments that are not relevant to this sql should not be included:

--* Note:                                                              *       

--*        Portions of code are commented out with commenting being    *       

--*        removed by sed command depending on whether the run is for  *       

--*        current fiscal year or staffing (future FY).                *       

--*            --*cur_fy is removed for current FY run                 *       

--*--------------------------------------------------------------------*       

3)  set verify on

Should be included in the group of "set" statements at the beginning of the sql

4)  The standard exit statement and comment block should be included at the end of the sql:

exit;                                                                          

--*--------------------------------------------------------------------*       

--*                                                                    *       

--* END OF AREGS304.SQL                                                *       

--*                                                                    *       

--*--------------------------------------------------------------------*

Please make these corrections ASAP so that the "standards compliant" version can be run when restarting the failed component.

As a reminder, there are sample sqls in /ais01/src/sql/updt on Kebler, demonstrating the sql standards:

/ais01/src/sql/updt/sql_appworx_plsql_sample

/ais01/src/sql/updt/sql_appworx_spool_sample

Oh.. one more thing.  Does the title reflect the author of this sql?

Janice.
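
Review items 1, 3 and 4 above are mechanical enough to check before a sql is promoted. The Python sketch below is one possible pre-check, not an existing standards tool: the function name `check_sql_standards` and the exact strings it searches for are assumptions based on the excerpts in this entry. Item 2 (irrelevant comments) still needs a human reviewer.

```python
def check_sql_standards(name, text):
    """Return findings for review items 1, 3 and 4 listed above."""
    findings = []
    lowered = [line.strip().lower() for line in text.splitlines()]
    # Item 1: the obsolete modlog header contains the "INIT SSR #" column titles.
    if any("init ssr #" in line for line in lowered):
        findings.append("obsolete modlog block present")
    # Item 3: "set verify on" should appear among the set statements.
    if "set verify on" not in lowered:
        findings.append("missing 'set verify on'")
    # Item 4: the standard exit statement and END OF comment block.
    if "exit;" not in lowered:
        findings.append("missing standard 'exit;' statement")
    if not any(f"end of {name.lower()}.sql" in line for line in lowered):
        findings.append("missing END OF comment block")
    return findings
```

An empty list means none of the three mechanical items were found; for example, `check_sql_standards("AREGS304", open("AREGS304.sql").read())` would list whatever still needs correcting.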

 

 

Aborted Module Name:   HRMSS228.SSH_SFTP_01

 

  Date:        Day:      Time:          Resolution:

08/19/11     Fri         21:44           Deleted by Robin.

 

Error log and follow up comments:

 

 

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l /ais01/ftp/to/user/HRMSS228.HARTFORD_EOI.DAT

# > -rw-rw----    1 appworx  Gftp          13585 Aug 19 21:43 /ais01/ftp/to/user/HRMSS228.HARTFORD_EOI.DAT

# > sftp> -ls -l /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls" not found

# > sftp> put /ais01/ftp/to/user/HRMSS228.HARTFORD_EOI.DAT /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

# > Uploading /ais01/ftp/to/user/HRMSS228.HARTFORD_EOI.DAT to /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

# > sftp> ls -l /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

19 21:45:22-Parent: (2)Checking child process(2564492)

19 21:45:22-Parent: Child process[2564492] found

19 21:45:22-Parent: Checking child mem

19 21:45:22-Parent: Value in mem [N]

19 21:45:22-Looking for [/appworx/run/kill.6788301.00]

19 21:45:22-No Kill File found('/appworx/run/kill.6788301.00').

19 21:45:22-Parent: sleeping for 10 seconds.

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls" not found

# > (1)

#==============================================================================

# FATAL : Command failed with code : 1

#------------------------------------------------------------------------------

# RETURN CODE = 100

 

I checked on the vendor web site and the destination file

  /DROP.CO.St.Univ.EOI.001/ColoStat_OnlineEOI.xls

does not exist.  This can happen if the vendor picks up the file and deletes it before we get a chance to confirm it is in place.  We probably will want to confirm with the vendor whether or not they received the file (13585 bytes).

*  If not, then we should be able to restart the SFTP component.

*  If they did, then we may need to change an option for the SFTP to not fail if the file is not found after we upload it.

Elden.
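
The two cases above come down to how the post-upload listing is interpreted. The Python sketch below shows that decision only: the function name `upload_outcome` and the `tolerate_pickup` flag are hypothetical and are not existing options of the SSH_SFTP component.

```python
def upload_outcome(put_succeeded, visible_after_put, tolerate_pickup=False):
    """Classify an upload that is followed by a verification listing."""
    if not put_succeeded:
        return "FAILED"
    if visible_after_put:
        return "VERIFIED"
    # The file uploaded but is no longer listed, e.g. the vendor already collected it.
    return "UNVERIFIED_OK" if tolerate_pickup else "FAILED"

print(upload_outcome(True, False))                        # FAILED
print(upload_outcome(True, False, tolerate_pickup=True))  # UNVERIFIED_OK
```

The first call matches the behavior seen in this abort; the second is what a "do not fail if the file is not found after upload" option would amount to.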

 

I have sent an email to Leanne at Hartford asking her to verify that they got the file.  I'll keep everyone posted with her response.

Leanne reported back that they did receive the file and all looks good.

-Bob-

 

 

 

 

Aborted Module Name: KFSXFPPD.KFSX_JAVA_01 

  Date:        Day:      Time:          Resolution:

08/22/11     Mon      14:02            See follow up below.

 

Error log and follow up comments:

 

WARNING: Prefs file removed in background /home/appworx/.java/.userPrefs/prefs.xml

WARNING: Prefs file removed in background /etc/.java/.systemPrefs/prefs.xml

Caused by:

Caused by: java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

+ print *** \n*** COPY END OF REMOTE SHELL LOG TO STD OUTPUT \n***

 

DV 1466687 is the bad DV. The attached update sql will remove the bad character. The bad character was in position 193, so Josh, looking only at the first 90 chars, missed it.

Shawn will need to run the update, then Dermot can run the job.

This can wait till morning if necessary.

Thanks for everyone’s help.

John. W.

Attachment below:

set dv_chk_stub_txt =

'Dr. Browning Participant Support 8.22.11 -Dr. Browning needs to compensate 2 subjects that had to drop out of the Reebok Research study due to injury and illness -Compensation for Dr. Brownings research participants. To be paid in cash to research participants, NOT income to Dr. Browning. PAY BY CHECK. Log sheet to be returned to A/P upon completion of Disbursement Voucher. Questions? Please contact Dr. Browning at 491-5868.'

where fdoc_nbr = '1466687'

 

For future reference, any aborts within the KFSXPD_DY_PDP_DAILY_CHECK_ACH chain or its sub-chains (such as this one in KFSXFPPD_FP_DV_PREDISB_EXTR) should be resolved during working hours – and not be allowed to carry over to the next working day.

The BFS users expect the morning ach/check cycle to run starting at 7:00 A.M.  Today’s chain is currently in SELF-WAIT status, waiting for yesterday’s chain to complete.  It does not make sense to complete yesterday’s chain now: unless someone caught it, the WAIT_FOR_TIME_03 component (after the aborted job) in yesterday’s chain would wait until 2:30 P.M., assuming we fixed the aborted job and let the chain continue.  That would mean that today’s chain would also not start running until sometime after 2:30 P.M.

To clear out yesterday’s chain and allow today’s chain to proceed, I deleted the following from backlog within yesterday’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain:

FSXPDCH_PDP_CHECKS_EXTR2 sub-chain

KFSXPD_DY.WAIT_FOR_TIME_03  component

KFSXPDDR_PDP_DAILY_RPT2 sub-chain

And the aborted KFSXFPPD.KFSX_JAVA_01 component

This will allow the CHAIN_FINISH of the KFSXFPPD_FP_DV_PREDISB_EXTR sub-chain to complete, as well as the KFSXPD_DY.CHAIN_SUMMARY_01 (to provide the daily ach/check summary from yesterday) and the KFSXPD_DY_CHAIN_FINISH_01.  Of course, this will also allow today’s “SELF-WAIT”  KFSXPD_DY_PDP_DAILY_CHECK_ACH chain to proceed.  I suspect that we will see the same error in today’s  KFSXFPPD.KFSX_JAVA_01 component, as it appears to be data related.  It appears that it was restarted several times yesterday, with the same results (error) – so unless the data has changed, it will likely fail on the same error in today’s  KFSXFPPD.KFSX_JAVA_01 component.  Should that be the case, it is important that the BFS users/KFSX Team work together to fix the error as soon as possible to allow this morning’s ach/check cycle to proceed.

Janice.

 

 

 

 

 

Aborted Module Name:   KFSXFPPD.KFSX_JAVA_01 

  Date:        Day:      Time:          Resolution:

08/23/11     Tue        07:24            Restarted by Dermot.

 

Error log and follow up comments:

 

 (PMT_NTE_ID,CUST_NTE_LN_NBR,CUST_NTE_TXT,LST_UPDT_TS,VER_NBR,PMT_DTL_ID,OBJ_ID) VALUES (?,?,?,?,?,?,?) '

* Exception message is [ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

* Vendor error code [12899]

* SQL state code [72000]

* Target class is 'org.kuali.kfs.pdp.businessobject.PaymentNoteText'

* PK of the target object is [id=10334944]

* Source object: paymentNoteText(id)=(10334944)

* The root stack trace is -->

* java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

              

This error will need to be fixed as soon as possible in order to minimize delay for this morning’s ach/check cycle.

Janice.

 

Last night, John found a solution for this problem.

As soon as a DBA applies the fix to the data, we will be ready to release it.

Can we just kill last night’s run and let today’s pick it up?

Last night we were under the impression that letting it go until this morning would not have a negative impact.

I have modified the logic that finds special characters so that it searches the entire string rather than only the first 90 characters.

I have added it to the resolution spreadsheet.

Josh.
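
Searching the entire string, as described above, can be sketched in Python as follows. This is an illustration and not the logic Josh modified. It rests on two assumptions that the log does not confirm: that the note text is stored in 90-character pieces, and that the column limit is counted in bytes, so that one multi-byte character makes a 90-character piece occupy 91 bytes. The function names are hypothetical.

```python
def non_ascii_positions(text):
    """Return (1-based position, character) for every non-ASCII character in text."""
    return [(i, ch) for i, ch in enumerate(text, start=1) if ord(ch) > 127]

def oversized_chunks(text, width=90, encoding="utf-8"):
    """Report (chunk number, byte length) for chunks whose encoded size exceeds width."""
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    report = []
    for number, chunk in enumerate(chunks, start=1):
        size = len(chunk.encode(encoding))
        if size > width:
            report.append((number, size))
    return report

# Made-up text with one accented character at position 193, as in this abort.
sample = "a" * 192 + "\u00e9" + "b" * 100
print(non_ascii_positions(sample)[0][0])  # 193
print(oversized_chunks(sample))           # [(3, 91)]
```

Under those assumptions, a character at position 193 lands in the third 90-character piece, which is consistent with "actual: 91, maximum: 90" and explains why a check of only the first 90 characters found nothing.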

 

See my earlier email (from around 7:15 this morning). I already did the clean-up of yesterday’s stalled KFSXPD_DY_PDP_DAILY_CHECK_ACH by deleting the failed component and some of the downstream components.  If a problem such as this cannot be fixed in a timely manner during working hours, then the appropriate course of action would be to delete downstream components, allowing only the CHAIN_SUMMARY and CHAIN_FINISH components to run, thereby completing that day’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain.

It doesn’t make sense to wait until the next day to finish out the previous day’s check cycle because, for one, the users expect an ach/check cycle to run in the morning, not just the leftover “checks only” cycle from the previous afternoon.  Additionally, taking the time to complete the previous day’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain would delay the current day’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain, perhaps until after 2:30 P.M. if the various WAIT_FOR_TIME components within the previous day’s chain are not forced to run “before their time”.  But more importantly, it is just cleaner to start over with the new day: it contains the ach/check cycle, so it makes more sense to start over with that cycle and stay on track, timing-wise.

The only danger would be if the BFS users had actually performed a FORMAT CHECKS yesterday (but I checked and no check xml files are present in /ais02/app/kfs/prd/work/staging/pdp/paymentExtract).  They usually don’t perform FORMAT CHECKS until after receiving the A.M./P.M. KFSXPDDR report, which lets them know what’s available for payment, and we never got to that subchain of the P.M. cycle yesterday.

 

If the problem is data related and/or requires a dba fix, then it will surface again in the next day’s KFSXPD_DY_PDP_DAILY_CHECK_ACH chain (the morning cycle) – at which time the fix can hopefully be implemented quickly enough so that the users’ morning ach/check cycle will not be delayed too long.

I suggest that you place the KFSXPD_DY.WAIT_FOR_TIME_01 component on hold and have someone manually verify with BFS that they have had time to perform the manual ‘FORMAT CHECKS’ process.  They usually would have already received the report from KFSXPDDR  (around 7:30)… so if that is delayed too long, they may not have time to review the report and perform ‘FORMAT CHECKS’ process prior to the 09:00 A.M. start time for the    KFSXPD_DY.WAIT_FOR_TIME_01 component.  They may not even have any  ach/checks for today – that does happen sometimes.  In that case, the timing is not such a big deal.

Once the KFSXPDDR sub-chain completes **AND** you’ve received confirmation from BFS that the FORMAT CHECKS has been completed (or they aren’t planning to do one this morning), then you may release the hold on the WAIT_FOR_TIME_01 component.  Do you know who will be contacting BFS regarding this?

I just checked and there is a check xml file in /ais02/app/kfs/prd/work/staging/pdp/paymentExtract, which was created 11/08/23  08:58.  I’d say we are “good to go” and could release the hold on WAIT_FOR_TIME_01 component.

Janice.
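
The manual check for a check xml file in the paymentExtract staging directory could be scripted along these lines.  This is only a sketch: the function name is hypothetical, and it reports each file's last-modified time, which is assumed here to be an acceptable stand-in for the creation time quoted above.

```python
from datetime import datetime
from pathlib import Path


def list_check_xml_files(staging_dir):
    """Return (file name, last-modified time) for each .xml file in staging_dir."""
    found = []
    for path in sorted(Path(staging_dir).glob("*.xml")):
        # The modification time is used as a proxy for "created".
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        found.append((path.name, modified))
    return found
```

An empty result would suggest that no FORMAT CHECKS output is waiting; a non-empty one lists the files and their timestamps, which can then be compared against the hold on WAIT_FOR_TIME_01.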

 

 

Aborted Module Name:   DOITDEMO_01.FTPS_CURL_01

  Date:        Day:      Time:          Resolution:

08/31/12     Fri          20:49           Restarted by Steve.

12/01/12     Sat          11:37           See note from David below.

03/30/13     Sat          14:44           Restarted by Steve.

Error log and follow up comments:

 

08/31/12.

#==============================================================================

# FATAL : Command failed with code : 100

#------------------------------------------------------------------------------

# 2011.08.31-21:06:50  : RETURN CODE = 100 : /appworx/csu/exec/FTPS_CURL.PL

#==============================================================================

error is 100

 

The log shows that the file was opened and couldn't be written to:

 

# > < 125-FTP Server unable to obtain EXCLUSIVE use of G.F.CSU.DOWN1 which is held by: 0093 DDWRKFRC EXCL on SYSDSN

# > < 125 Data set G.F.CSU.DOWN1 is not available

According to the log, no data was written, so this can probably be tried again.  Please check the process flow notes to verify whether it can be restarted.

When it is restarted, if it fails again, then someone will have to follow-up with the state as to why the file is locked on their server.

Elden.

 

12/01/2012 11:37    DEPETERS

Noticed that DOITDEMO_01.FTPS_CURL_01 had failed. It could not find the source file. It looks like when the condition ran to copy the file it thought the utl_file was empty. I found the backup file and found that it did have data in it. Timing issue? I manually copied the backup file to the /ais01/ftp/to/user directory and restarted DOITDEMO_01.FTPS_CURL_01. It finished successfully.

 

03/30/2013 14:44    SSGREENE

DOITDEMO_01.FTPS_CURL_01 failed looking for file /ais01/ftp/to/user/DOITDEMO_01.HRMSS051_01.DAT

 

For some reason the utl file was never copied from HRMSDEMO_01.HRMSS051_01 to /ais01/ftp/to/user/DOITDEMO_01.HRMSS051_01.DAT like it should have been.  I manually copied the backup of the utl file to /ais01/ftp/to/user, renamed it to DOITDEMO_01.HRMSS051_01.DAT and restarted the failed component, which finished successfully.

 

 

Aborted Module Name:   FAIDSAIG_EV.TDCLIENT_01

  Date:        Day:      Time:          Resolution:

09/01/11     Thu        00:05           See follow up below.

 

Error log and follow up comments:

 

 

+ print APPEND=Y

+ 1>> FAIDSAIG_EV_receive_cmdfile

+ print AUTOEXT=N

+ 1>> FAIDSAIG_EV_receive_cmdfile

+ print receiveclass=EXITFFOP)"

+ 1>> FAIDSAIG_EV_receive_cmdfile

+ read this_receive_tdclient

+ tdclient_out=/ais01/dat/work/prod/FAIDSAIG_EV.TDCLIENT_01.non_isir

+ tdclientc cmdfile=FAIDSAIG_EV_receive_cmdfile

+ 1> /ais01/dat/work/prod/FAIDSAIG_EV.TDCLIENT_01.non_isir 2>& 1

+ exit 107

 

I officially "hate" TDCLIENT!!!  I had to build a temp version of the TDCLIENT.KSH to bypass the portion of the job which had already successfully completed and pick up where it left off.  Why is it that if SAIG is going to stop communicating with us -- it had to do that **between** collection of the ISIR and non-ISIR message class files??  Just to make it more difficult for us!

I think we should have a follow-up Clarity incident to re-examine the TDCLIENT processing.  Maybe we could separate the ISIR and non-ISIR processing into separate run types - thereby making restarts easier in a situation like the one we faced this morning.

Cross your fingers that in the "temp" script I removed only what should have been removed!! 

Janice.

 

 

 

Aborted Module Name:   KFSXCS52.KFSXS007_01

  Date:        Day:      Time:          Resolution:

09/02/11      Fri          03:07           See follow up below.

 

Error log and follow up comments:

 

Terminated Employee: Duncan,Karen Lee                44472   klduncan         INACTIVE                         Employee

Terminated Employee: Edler,Joshua Robert      43384   jredler          INACTIVE                         CSU Ex-Employee

Terminated Employee: Edwards,Ryan J           28464   edwa3314         INACTIVE                         Employee

Terminated Employee: Eisenhauer,Scarlett Frederike 52183   seisenha           INACTIVE                         CSU Ex-Employee

declare

*

ERROR at line 1:

ORA-20100: Too many Users are being terminated.  Verify the HR data.

ORA-06512: at line 355

 

There were over 100 people that were set to be terminated in KFS today.

Since that broke our threshold, the job failed.

Josh.

 

We have set up a catch in KFS that if over 100 people terminate then the Kuali user update is canceled.  A couple of weeks ago this job tried to terminate thousands of people so we built this safety guard.

Last night the system sent over 100 people to terminate.  Can you have someone verify that this is correct?

Theresa.

 

Yes, that sounds reasonable.  Three processes run on the first of each month to automatically terminate 3 groups of assignments:

-          Those with an Appt End Date more than 3 months past

-          Those with an I-9 which expired more than 3 months past

-          Students and non-student hourlies who haven’t been paid in the last 18 months

If you want to send me the full list when you have it, I’m happy to do a little more checking.

Carolee.

 

I will have the job restarted. Do you want me to set the value to 500 for every day or just for today’s run?

Josh.

 

Let’s bump it for every day – we are just trying to catch something excessive.  It sounds like it will not be uncommon for it to be at least this large once a month.

Theresa.

 

The maximum termination threshold was increased to 500 terminations, as opposed to 100.

This flag is used as a safety measure.

This process worked as expected by raising the error.  Users needed to verify the data and there were no technical problems.

HR reviewed the data and was comfortable with the number of terminations.

We have permanently increased this threshold since there are months when there could be a lot of activity.

Josh.
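
The safety guard described in this entry can be illustrated with a short sketch.  The real check lives in the KFSXS007 PL/SQL, which is not shown in this log, so the names below are hypothetical and only the 500 limit and the wording of the error message come from the notes above.

```python
MAX_TERMINATIONS = 500  # raised from 100, per the follow-up above


class TooManyTerminations(Exception):
    """Raised when a run would terminate more users than the threshold allows."""


def check_termination_count(pending_users, max_terminations=MAX_TERMINATIONS):
    """Return the number of pending terminations, or raise if it exceeds the limit."""
    count = len(pending_users)
    if count > max_terminations:
        # Mirrors the intent of ORA-20100: stop and have the HR data verified.
        raise TooManyTerminations(
            f"{count} users are being terminated (limit {max_terminations}). "
            "Verify the HR data."
        )
    return count
```

Failing the whole update rather than processing part of the list keeps the job all-or-nothing, which matches how the abort behaved here.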

 

 

 

 

 

Aborted Module Name:   EIDSUPDT.HRMSS111_01

  Date:        Day:      Time:          Resolution:

09/06/11      Tue       22:39           Restarted by Joleen.

 

Error log and follow up comments:

 

+ print *** \n*** END SEARCH OF LOG FOR SQL ERROR STRINGS \n***

+ 1>> /ais01/dat/work/prod/EIDSUPDT.HRMSS111_01.6868218.6868225.00.2011_09_06_2239_sql_followup

+ cat /ais01/dat/work/prod/EIDSUPDT.HRMSS111_01.6868218.6868225.00.2011_09_06_2239_sql_followup

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

829180059 ORA-01422: exact fetch returns more than requested number of rows

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

The problem is in the CSUH_EMAIL_UPDATE function.  There are two records in the per_all_people_f table that match the CSU ID (829180059).  It looks like it's two different people, but with the same name.  The CSU ID (attribute 12) needs to be changed for one of them.

-Bob-
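
A check for this kind of duplicate could look like the sketch below.  It is an illustration only: it assumes the (person id, CSU ID) pairs have already been pulled from per_all_people_f, and the function name and sample person ids are hypothetical.

```python
from collections import defaultdict


def find_shared_csu_ids(rows):
    """Return {csu_id: [person ids]} for every CSU ID used by more than one person.

    rows is an iterable of (person_id, csu_id) pairs.  Repeated rows for the
    same person are collapsed, so only genuinely shared IDs are reported.
    """
    people_by_csu_id = defaultdict(set)
    for person_id, csu_id in rows:
        people_by_csu_id[csu_id].add(person_id)
    return {
        csu_id: sorted(people)
        for csu_id, people in people_by_csu_id.items()
        if len(people) > 1
    }
```

Any CSU ID reported by such a check would make an exact fetch on that ID return more than one row, which is the condition behind ORA-01422.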

 

Who can/should make this change?

The person in Banner that I associate with this ID has an HR ID of H52492.  Hope that makes sense.

Vicki.

 

 

 

 

Aborted Module Name:  AREGORGN.AREGS002_01

 

  Date:        Day:      Time:          Resolution:

09/07/11      Tue        17:02           See follow up below.

 

Error log and follow up comments:

 

 

line=201210 BC   487A  FNS2

line=201210 BC   487B  FNS2

line=201210 BC   499A  FNS2

line=201210 BC   499B  FNS2

line=201210 BC   711A  FNS2

line=201210 BC   711C  FNS2

line=201210 BC   711D  FNS2

line=201210 BC   711F  FNS2

line=201210 BIOM 476A  FEG3

line=201210 BIOM 476B  FEG3

begin <<all_block>>

*

ERROR at line 1:

ORA-20500: Error count Exceeded 200

ORA-06512: at line 332

 

Hi Denise and Jerry

AREGS002 is aborting because it has more than 200 error messages.  Most of the error messages look like the messages below.  What do you want to do?  Do you want to skip the AREGS002 module and work at completing your AREG schedule?  Or do you want to do something else?

 

Attribute FBU3 Already Exists

201210 ACT  679A  FBU3

 

Attribute FAG1 Already Exists

201210 AGRI 496A  FAG1

Vicki.

 

Hi Vicki, I put that file out yesterday.  Let's skip the AREGS002 and I'll look at it again, but let's get the AREG schedule completed.

Sorry for the delay, Jerry & I were in a meeting.

Denise.

 

The AREGS002 module has been skipped and the rest of the AREG schedule should start running.

Vicki.

 

 

 

Aborted Module Name:   AROSFRQ1.TGRAPPL_01

  Date:        Day:      Time:          Resolution:

09/12/11      Mon      21:15           See note from Janice below.

 

Error log and follow up comments:

 

I happened to log in tonight and noticed the AROSFRQ1.TGRAPPL and AROSDPA1.TGRAPPL failures.  Normally, we would not have AROSFRQ1 running at the same time as AROSDPA1 because we create the file to stop the AROSFRQ1 cycles before AROSDPA1 starts.  However, this was an earlier AROSFRQ1.TGRAPPL which failed with a resource deadlock:

RUN SEQUENCE NUMBER:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 682

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 1519

ORA-06512: at line 1

 

WRN-ORACERR: Error occurred in file "tgrappl.pc" at line 3,395

WRN-ERRSTMT: Following statement was last statement parsed:

    begin tb_receivable . p_update ( p_PIDM => :ap_request_pidm , p_TRAN_N

tgrappl terminated with error

 

Once a TGRAPPL fails, then any subsequent TGRAPPL's (like tonight's AROSDPA1.TGRAPPL) will fail with:

*                   **WARNING**                         *

*  You cannot submit this job - it is already running.  *

*                                                       *

*  You will also get this message if a previous run of  *

*  this program aborted.  If this is the case, the      *

*  control record for that run must be deleted before   *

*  proceeding. (GJBPRUN record for this jobname with    *

*  a -1 one-up-no).

 

So, since I think the AROSFRQ1.TGRAPPL didn't really get off the ground due to the resource deadlock, I'm going to delete the aforementioned GJBPRUN record and try to resubmit AROSFRQ1.TGRAPPL.  If that completes okay, then I'll restart AROSDPA1.TGRAPPL.

 

09/12/2011 21:35    JMWILKIN

That worked -- AROSFRQ1.TGRAPPL output seems normal.  I waited until the remainder of the AROSFRQ1 cycle finished, then restarted the failed AROSDPA1.TGRAPPL.

By the way, as followup, we need to figure out what "deadlocked" with the AROSFRQ1.TGRAPPL.

AROSDTRN.TGRCLOS_01 was running at the same time - would these two programs fight over resources??

 

 

 

 

Aborted Module Name:   AROSDPA1.TGRAPPL_01

  Date:        Day:      Time:          Resolution:

09/12/11      Mon      21:15           See note from Janice below.

 

Error log and follow up comments:

 

 

I happened to log in tonight and noticed the AROSFRQ1.TGRAPPL and AROSDPA1.TGRAPPL failures.  Normally, we would not have AROSFRQ1 running at the same time as AROSDPA1 because we create the file to stop the AROSFRQ1 cycles before AROSDPA1 starts.  However, this was an earlier AROSFRQ1.TGRAPPL which failed with a resource deadlock:

RUN SEQUENCE NUMBER:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 682

ORA-06512: at "BANINST1.TB_RECEIVABLE", line 1519

ORA-06512: at line 1

 

WRN-ORACERR: Error occurred in file "tgrappl.pc" at line 3,395

WRN-ERRSTMT: Following statement was last statement parsed:

    begin tb_receivable . p_update ( p_PIDM => :ap_request_pidm , p_TRAN_N

tgrappl terminated with error

 

Once a TGRAPPL fails, then any subsequent TGRAPPL's (like tonight's AROSDPA1.TGRAPPL) will fail with:

*                   **WARNING**                         *

*  You cannot submit this job - it is already running.  *

*                                                       *

*  You will also get this message if a previous run of  *

*  this program aborted.  If this is the case, the      *

*  control record for that run must be deleted before   *

*  proceeding. (GJBPRUN record for this jobname with    *

*  a -1 one-up-no).

 

So, since I think the AROSFRQ1.TGRAPPL didn't really get off the ground due to the resource deadlock, I'm going to delete the aforementioned GJBPRUN record and try to resubmit AROSFRQ1.TGRAPPL.  If that completes okay, then I'll restart AROSDPA1.TGRAPPL.

 

09/12/2011 21:35    JMWILKIN

That worked -- AROSFRQ1.TGRAPPL output seems normal.  I waited until the remainder of the AROSFRQ1 cycle finished, then restarted the failed AROSDPA1.TGRAPPL.

By the way, as followup, we need to figure out what "deadlocked" with the AROSFRQ1.TGRAPPL.

AROSDTRN.TGRCLOS_01 was running at the same time - would these two programs fight over resources??

 

 

 

Aborted Module Name:   KFSXFPPD.KFSX_JAVA_01 

  Date:        Day:      Time:          Resolution:

09/15/11      Thu       07:04           Restarted by Dermot.

 

Error log and follow up comments:

 

 

2011-09-15 07:17:13,180 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

RunBatch ERROR: Exception found:

org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

Caused by: java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 91, maximum: 90)

            at java.lang.Throwable.<init>(Throwable.java:67)

 

We have not received this morning’s check format and our Kuali people are out this morning.  Is there a problem?

Thanks,

Jackie.

 

Yes, the pre-disbursements extract program, disbursementVoucherPreDisbursementProcessorExtractStep, failed.  The notification regarding this failure was sent early this morning to the IS Kuali Team, but so far we have not heard anything back so I’m guessing that they are still working to resolve the problem.

Of course, this program is early in the daily ach/check process and we need to solve this problem before the checks will be produced.

I’ve placed a hold on the portion of the chain which would normally take off at 09:00 because you may not receive the report by then and/or have time to do the Format Checks.

I’ve included the IS Kuali Team on this email traffic, so hopefully they can update all of us with progress on solving the problem.

Janice.

 

We are in the process of updating the data to correct the special characters. We should have this resolved shortly.

Josh.
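
A pre-check for notes that will not fit in CUST_NTE_TXT might look like the sketch below.  It rests on two assumptions that this log does not confirm: that the 90 limit is counted in bytes, and that the database character set is UTF-8.  Under those assumptions a single special character can push a 90-character note to 91 bytes, which would match the error above.  The function name is hypothetical.

```python
def oversized_notes(notes, max_bytes=90, encoding="utf-8"):
    """Return (index, byte length) for each note longer than max_bytes when encoded."""
    too_long = []
    for index, note in enumerate(notes):
        size = len(note.encode(encoding))
        if size > max_bytes:
            too_long.append((index, size))
    return too_long
```

For example, a note made of 89 plain letters followed by "é" is 90 characters long but encodes to 91 bytes in UTF-8, so it would be reported.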

 

 

 

 

 

Aborted Module Name:   AROSSTM1.AROSS302_01

  Date:        Day:      Time:          Resolution:

09/15/11      Thu       23:20          Restarted by Joleen.

 

Error log and follow up comments:

 

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-00060: deadlock detected while waiting for resource

ORA-06512: at "ODSMGR.TOKODST", line 48

ORA-00001: unique constraint (TAISMGR.TBRCCHG_INDEX_01) violated

ORA-06512: at "TAISMGR.TT_TBBACCT_INSERT_ODS_CHANGE", line 8

ORA-04088: error during execution of trigger

ORA-06512: at line 436

ORA-06512: at line 1086

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

It looks like the error was spawned from a deadlock.

Please restart the module.

Josh.

 

 

 

 

Aborted Module Name:   ODSRAROS.ODSRS001_01

 

  Date:        Day:      Time:          Resolution:

09/15/11      Thu       23:26          Restarted by Joleen.

 

Error log and follow up comments:

 

old   6:     csug_ods_refresh.log_begin_time('&REFRESH_APP');

new   6:     csug_ods_refresh.log_begin_time('REFRESH_AR');

old   8:     ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

new   8:     ia_admin.mgkmap.P_RunETLMapSlots('3', job, 'REFRESH_AR', NULL, '');

old  14:     csug_ods_refresh.log_end_time('&REFRESH_APP');

new  14:     csug_ods_refresh.log_end_time('REFRESH_AR');

begin

ERROR at line 1:

ORA-20001: ODS Refresh Failed

ORA-06512: at line 12

 

Here are the errors I found in REFRESH_AR.  These two mappings had errors; they did finish successfully (highlighted numbers), but the IA Admin tool had trouble verifying this and considered them failed.

We have two options: we can re-run the Refresh AR to get the correct end date/timestamp (30-40 mins), or we can continue on with whatever follows this component.

 

 

 

 

OWNER   ID      MAP                 ELT       RS        SEL     INS     DEL     START_TIME       END_TIME
JOBPRD  459500  DELETE_MTT_ACCOUNT  00:01:08  COMPLETE  130670  0       130307  9/15/2011 23:26  9/15/2011 23:27
JOBPRD  459511  UPDATE_MTT_ACCOUNT  00:12:44  COMPLETE  130322  130322  0       9/15/2011 23:33  9/15/2011 23:46

- Mark

 

 

 

 

Aborted Module Name:   HRMSACH_QPS.NACHA_01  

  Date:        Day:      Time:          Resolution:

09/16/11      Fri         08:12          See follow up below.

 

Error log and follow up comments:

 

HRMSACH_QPS.NACHA_01 / HRMSACH_NACHA_PROCESSING is in FIN-DB ERROR.

 

2011-09-16 08:10:49 Prompt 22 changed from "{#HRMS_{#2}_PAYROLL_TYPE}" to "{#HRMS_QPS_PAYROLL_TYPE}" by OSU=appworx JDBC Thin Client

2011-09-16 08:10:49 Prompt 24 changed from "{#HRMS_{#3}_PAYDATE_HRFORMAT}" to "{#HRMS_QUICK_PAYDATE_HRFORMAT}" by OSU=appworx JDBC Thin Client

2011-09-16 08:10:49 Prompt 25 changed from "{#HRMS_{#3}_PAYDATE_HRFORMAT}" to "{#HRMS_QUICK_PAYDATE_HRFORMAT}" by OSU=appworx JDBC Thin Client

CON-2011-09-16 08:12:01 Set Subvar

CON-2011-09-16 08:12:04 Set Subvar

CON-2011-09-16 08:12:07 Set Subvar

CON-2011-09-16 08:12:09 Set Subvar

2011-09-16 08:12:09 Prompt 22 changed from "{#HRMS_QPS_PAYROLL_TYPE}" to "21" by OSU=appworx JDBC Thin Client

2011-09-16 08:12:09 Prompt 24 changed from "{#HRMS_QUICK_PAYDATE_HRFORMAT}" to "2011/09/15 00:00:00" by OSU=appworx JDBC Thin Client

2011-09-16 08:12:10 Prompt 25 changed from "{#HRMS_QUICK_PAYDATE_HRFORMAT}" to "2011/09/15 00:00:00" by OSU=appworx JDBC Thin Client

2011-09-16 08:13:20 java.sql.SQLException: ORA-12899: value too large for column "APPWORX"."SO_JOB_QUEUE"."SO_LOG_REVIEWED" (actual: 1792, maximum: 1)

ORA-06512: at "APPWORX.AW5", line 2464

ORA-06512: at line 1

aw5.aw_condition_action

                0 jobid: IN:NUMERIC:java.math.BigDecimal:6917597

                1 condition_order: IN:NUMERIC:java.math.BigDecimal:10

                2 action: IN:VARCHAR2:java.lang.String:SET SUBVAR

                3 performed: IN:OUT:VARCHAR2:java.lang.String:N

                4 actionArg: IN:VARCHAR2:java.lang.String:#HRMSACH_QPS_PAYROLL_ACTION_ID=33226831

                5 results: OUT:NUMERIC::null

                6 text: OUT:VARCHAR2::null

FIN-DB ERROR(FINISHED) 2011-09-16 08:13:20

2011-09-16 08:47:07

 

I have reviewed the output for request 6896549 and everything seems to be in order.  It processed 4 employees successfully.

Can you provide more information pertaining to the abort you are seeing?

Steve H.

 

An after condition had failed:

The appman variables for the HRMSACH_QPS.NACHA_01 after condition were updated manually and the component was deleted so that HRMSACH_NACHA_PROCESSING could continue.

David.

 

 

 

 

Aborted Module Name:  KFSXCS52.KFSXS007_01

  Date:        Day:      Time:          Resolution:

01/19/12      Thu       08:57           Restarted by Dermot.

 

Error log and follow up comments:

 

PLS-00201: identifier 'CSUF_EMPLOYEE_PRIMARY' must be declared

ORA-06550: line 119, column 28:

PL/SQL: Item ignored

ORA-06550: line 120, column 28:

PLS-00352: Unable to access another database 'KRTEST@KRUSER'

ORA-06550: line 120, column 28:

PLS-00201: identifier 'CSUF_EMPLOYEE_PRIMARY' must be declared

ORA-06550: line 120, column 28:

 

Was this running on production?

I am a little concerned that any production job would be referencing KRTEST.

Josh.

 

I saw the same thing – and yes, this is running on AWPROD.  The KFSXS007 sql runs using jobprd@kfsprd login, so we’re definitely running this sql against kfsprd.

Looks like CSUF_EMPLOYEE_PRIMARY, mentioned in the error message, is a view that selects from csuf_employee_primary@kfs_to_ods

It becomes difficult to follow the tracks across links, across views and so on, but it appears that something (maybe in odsprod) is pointing to krtest?

Janice.

 

I have found the problem; it looks like a synonym is incorrect.

I will work with a DBA to get the problem resolved and let scheduling know when we are ready to restart the module.

Josh.

 

 

 

 

 

Aborted Module Name:   KFSXAPEI.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

09/22/11      Thu       06:01          See follow up below.

 

Error log and follow up comments:

 

KFSXAPEI.KFSX_JAVA_01 / KFSXAPEI.KFSX_JAVA_01 is in ABORTED status.

 

If the error is not obvious in this output file, then try searching for INVALID to locate the correct error below in yellow.

 

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

***

+ egrep ^log4j:|^WARNING:|^ERROR:|^Exception|^Caused by: /ais02/log/KFSXAPEI.KFSX_JAVA_01.6944212.6944215.00.2011_09_22_0601.log

log4j:WARN File option not set for appender [LogFile].

log4j:WARN Are you using FileAppender instead of ConsoleAppender?

log4j:WARN File option not set for appender [MemoryLogFile].

log4j:WARN Are you using FileAppender instead of ConsoleAppender?

log4j:ERROR No output stream or file set for the appender named [LogFile].

WARNING: Prefs file removed in background /home/appworx/.java/.userPrefs/prefs.xml

WARNING: Prefs file removed in background /etc/.java/.systemPrefs/prefs.xml

+ print *** \n*** COPY END OF REMOTE SHELL LOG TO STD OUTPUT \n***

***

*** COPY END OF REMOTE SHELL LOG TO STD OUTPUT

 

2011-09-22 06:06:10,842 [main] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceOrderHolder :: Adding reject reason - Invoice Purchase Order Number is an Invalid Number (Invoice Order ID:280373 REPLACE)

 

To locate the log file go to /ais02/log

To locate the xml file (see below) go to /ais02/app/kfs/prd/work/staging/purap/electronicInvoice

 

<InvoiceDetailOrderInfo>                                                                                        

<OrderReference orderID='280373 REPLACE' orderDate='2011-09-21'><DocumentReference payloadID='280373 REPLACE'></Do

</OrderReference>                                                                                                

<SupplierOrderInfo orderID='48941309'></SupplierOrderInfo>                                                      

</InvoiceDetailOrderInfo>                                                                                        

<InvoiceDetailItem invoiceLineNumber='1' quantity='1'><UnitOfMeasure>CS</UnitOfMeasure>                         

<UnitPrice>     

 

There should be no space in '280373 REPLACE'.

John inserted a _ ('280373_ REPLACE') and the job was restarted.

Dermot.                                                                      
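
Offending order IDs could be spotted before the job runs with something like the sketch below.  It assumes the invoice file is well-formed XML and flags only whitespace inside the orderID attribute, per the note above; the function name is hypothetical and the actual validation inside KFS may apply other rules.

```python
import xml.etree.ElementTree as ET


def order_ids_with_whitespace(xml_text):
    """Return every OrderReference orderID value that contains whitespace."""
    root = ET.fromstring(xml_text)
    flagged = []
    for element in root.iter("OrderReference"):
        order_id = element.get("orderID", "")
        if any(ch.isspace() for ch in order_id):
            flagged.append(order_id)
    return flagged
```

Run against a fragment like the one quoted above, this would report '280373 REPLACE'.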

 

 

 

Aborted Module Name:  FAIDDLDR_EV.RERIM12_04

  Date:        Day:      Time:          Resolution:

09/30/11      Fri         00:13          See follow up below.

 

Error log and follow up comments:

 

 

+ grep 6983812

      6983812.00 BANNER    FAIDDLDR_EV.RERIM12_09/30 00:16 00:00:02 ABORTED                FAIDDLDR_DISB_REQUIREMENT

+ print Failure in spawned RERIM12 - abort this module

Failure in spawned RERIM12 - abort this module

+ exit 1

 

1 row created.

 

Elapsed: 00:00:00.01

                'crpn12op.2011_09_29_1010.bak.xml'

                *

ERROR at line 15:

ORA-12899: value too large for column "GENERAL"."GJBPRUN"."GJBPRUN_VALUE"

(actual: 32, maximum: 30)

 

As I was working on the KFSX job I noticed that FAIDDLDR_EV.RERIM-LOOP_01 failed in spawned FAIDDLDR_EV.RERIM12_04 with the following error:

 

Elapsed: 00:00:00.01

                'crpn12op.2011_09_29_1010.bak.xml'

                *

ERROR at line 15:

ORA-12899: value too large for column "GENERAL"."GJBPRUN"."GJBPRUN_VALUE"

(actual: 32, maximum: 30)

 

I decided to remove this offending file since it looked like an exact duplicate of crpn12op.xml, as shown below. I copied the file with the too-long name to my directory and restarted the looper so that the Finaid schedule would not hang all night.

 

283 Kebler finaid% ls -l crpn*

-rw-rw----    1 appworx  Gprd          95370 Sep 30 00:06 crpn12op.2011_09_29_1010.bak.xml

-rw-rw----    1 appworx  Gprd          95370 Sep 30 00:06 crpn12op.xml

284 Kebler finaid% cp crpn12op.2011_09_29_1010.bak.xml ~dpeterso

David.
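
File names that will not fit in GJBPRUN_VALUE could be caught before the looper picks them up.  The sketch below is illustrative only: the function name is hypothetical, and the 30-character limit is taken from the ORA-12899 message above.

```python
def names_too_long(file_names, max_length=30):
    """Return the file names whose length exceeds max_length characters."""
    return [name for name in file_names if len(name) > max_length]
```

For the two files listed above, only crpn12op.2011_09_29_1010.bak.xml (32 characters) would be reported; crpn12op.xml (12 characters) fits.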

 

 

 

Aborted Module Name:  ODSRAROS.ODSRS001_01

  Date:        Day:      Time:          Resolution:

10/17/11     Mon       23:25          Restarted by Joleen.

 

Error log and follow up comments:

23:25:46  11          if (error_ind <> 'N') then

23:25:46  12            raise_application_error(-20001,'ODS Refresh Failed');

23:25:46  13          end if;

23:25:46  14          csug_ods_refresh.log_end_time('&REFRESH_APP');

23:25:46  15        end;

 

old   6:     csug_ods_refresh.log_begin_time('&REFRESH_APP');

new   6:     csug_ods_refresh.log_begin_time('REFRESH_AR');

old   8:     ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

new   8:     ia_admin.mgkmap.P_RunETLMapSlots('3', job, 'REFRESH_AR', NULL, '');

old  14:     csug_ods_refresh.log_end_time('&REFRESH_APP');

new  14:     csug_ods_refresh.log_end_time('REFRESH_AR');

 

ORA-20001: ODS Refresh Failed

ORA-06512: at line 12

The error can be ignored.  There is a bug in the IA Admin interface that pops up once in a while where it can’t verify if a mapping has completed, but the OWB audit log does show it is complete.

 

ls: 0653-341 The file /orautl/odsprod/ODSRAROS.ODSRS001_01.utl_file* does not exist.

 

 

 

 

MAP                        ELT       RS        SEL     INS     DEL     START_TIME        END_TIME
UPDATE_MTT_ACCOUNT_DETAIL  00:01:56  COMPLETE  78525   78525   0       10/17/2011 23:45  10/17/2011 23:47
UPDATE_MTT_ACCOUNT         00:11:52  COMPLETE  131041  131041  0       10/17/2011 23:33  10/17/2011 23:45
DELETE_MTT_ACCOUNT_DETAIL  00:04:55  COMPLETE  78525   0       76266   10/17/2011 23:26  10/17/2011 23:31
DELETE_MTT_ACCOUNT         00:00:55  COMPLETE  131754  0       131004  10/17/2011 23:25  10/17/2011 23:26

 

 

Aborted Module Name:   FAIDCFIM_FA.SWPCOFI_01

  Date:        Day:      Time:          Resolution:

11/10/11     Thu        07:01          Restarted by Joleen.

10/11/12     Thu        06:59          Restarted by Joleen.

 

Error log and follow up comments:

 

11/10/11.     

ABORT: data file record number 1 contains incorrect number of fields

 

Import Error and Warning Legend

--------------------------------

IMP-001: File ID could not be match to SPBPERS, GOBINTL, or SWRSDET. Record not loaded.

IMP-002: Matching SWRSDET record does not exists for batch. Record not loaded

IMP-003: File birth date does not match SPBPERS_BIRTH_DATE.

IMP-004: Student name does not match SPRIDEN.

 

I've copied the COF file to your secure directory /userfiles/Ufaid/data/FAIDCFIM_FA.DECRYPT_01.7187999.DAT

Appears the first record has both header and data values - thus can't match.

Can you have COF send the corrected file back out, or should we try to fix it ourselves?

Phil.
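
A quick field-count check on an incoming file might look like the sketch below.  The delimiter and the expected number of fields for the COF file are not given in this log, so both are parameters here, and the function name is hypothetical.

```python
def records_with_wrong_field_count(lines, expected_fields, delimiter=","):
    """Return (record number, field count) for each record with an unexpected field count."""
    problems = []
    for number, line in enumerate(lines, start=1):
        field_count = len(line.rstrip("\r\n").split(delimiter))
        if field_count != expected_fields:
            problems.append((number, field_count))
    return problems
```

A first record that carries both header and data values, as described above, would show up as record number 1 with too many fields.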

 

10/11/12.

ORA-03135: connection lost contact

Process ID: 0

Session ID: 0 Serial number: 0

 

SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon>  ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon>  ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

Now turn on set -x for debug purposes

+ [ -f login.11131303 ]

+ echo Could not log in to SQL*Plus.

Could not log in to SQL*Plus.

+ echo Exiting with error (return code = 5).

 

I checked the jobprd@banprod login and the test was successful. The job itself never got started, so no processing has occurred for it; even the insertion of prompt values into table gjbprun did not complete successfully. Given that there are no other conditions to worry about, I would just reset the failed component.

Gudrun.

 

There were no conditions on the component. I restarted and the component has finished running.

Joleen.

 

 

 

 

Aborted Module Name:  HRMSS241.SSH_SFTP_04

  Date:        Day:      Time:          Resolution:

11/23/11     Wed       20:58          See note from Janice below.

04/28/14     Mon       07:20          Restarted by Robin.

 

Error log and follow up comments:

 

11/23/11.

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l {#ENCRYPT_DEST_FILE_7258991

# > ls: 0653-341 The file {#ENCRYPT_DEST_FILE_7258991 does not exist.

# > Shell exited with status 2

# > sftp> -ls -l /HRMSS241_NEWHIRE.pgp

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/HRMSS241_NEWHIRE.pgp" not found

# > sftp> put {#ENCRYPT_DEST_FILE_7258991 /HRMSS241_NEWHIRE.pgp

# > File "{#ENCRYPT_DEST_FILE_7258991" not found.

# > (1)

#==============================================================================

# FATAL : Command failed with code : 1

#------------------------------------------------------------------------------

 

When the SSH_SFTP_04 component was added last week, the associated prompt #1 value was missing the trailing } character.

I modified the chain definition, changing {#ENCRYPT_DEST_FILE_{chain_id} to {#ENCRYPT_DEST_FILE_{chain_id}}.

Likewise, for the aborted component, I added the ending } character, changing {#ENCRYPT_DEST_FILE_7258991 to {#ENCRYPT_DEST_FILE_7258991} and restarted the failed component.  It completed successfully.

As follow-up, might be worthwhile to double-check the other chains that transfer files to HealthSmart to verify that all newly added SSH_SFTP component(s) have the source filename prompt value specified properly - {#ENCRYPT_DEST_FILE_{chain_id}}

Janice.
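
The double-check suggested above amounts to confirming that every { in a prompt value has a matching }.  A minimal sketch follows; the function name is hypothetical, and it looks only at brace balance, not at whether the substitution variable actually exists.

```python
def braces_balanced(prompt_value):
    """Return True if every '{' in prompt_value is closed by a matching '}'."""
    depth = 0
    for ch in prompt_value:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                # A closing brace appeared before its opening brace.
                return False
    return depth == 0
```

With this check, {#ENCRYPT_DEST_FILE_{chain_id} is rejected and {#ENCRYPT_DEST_FILE_{chain_id}} passes.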

 

04/28/14.

# > secureftp.healthsmart.comPermission denied (publickey).

 

Looks like something has changed at Healthsmart. Who do we contact to get the new login credentials?

Steve G.

 

Did the subvar for the identity file get changed? It seems to be set to a value other than expected.

Please check values against successful run 13020083 and retry.

Gudrun.

 

Shouldn't the idfile be "/home/jobprd/.ssh/csu_to_health_smart-4096-20111027"?

The current value for the idfile is "cstate@secureftp.healthsmart.com:/HRMSS241_NEWHIRE.pgp".

Please try changing the idfile to "/home/jobprd/.ssh/csu_to_health_smart-4096-20111027".

Also, the appman variable #ssh_idfile_jobprd_healthsmart needs to be set to /home/jobprd/.ssh/csu_to_health_smart-4096-20111027.

David.

 

I have made the corrections and HRMSS241_COBRA is now complete.

Robin.

 

 

 

 

 

 

 

 

Aborted Module Name:   AREGORLA.AREGS519_01

  Date:        Day:      Time:          Resolution:

12/13/11     Tue        07:00           Restarted by Joleen.

 

Error log and follow up comments:

 

 

There is a missing file AREGORLA.AREGS519_01.DAT in the userfiles/Umath directory.

 

I’ll follow up with Lois Samer to see what she has to say.

Vicki.

 

Vicki and I talked. I have removed AREGORLA_ONREQ_LAST_ATTENDED from backlog. I will request this process flow back in when the user has the file available.

Joleen.

 

 

 

 

Aborted Module Name:   KFSXFPPD.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

01/24/12     Tue         14:33           See below.

 

Error log and follow up comments:

 

 

Java(TM) SE Runtime Environment (build pap6460sr5-20090529_04(SR5))

IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 AIX ppc64-64 jvmap6460sr5-20090519_35743 (JIT enabled, AOT enabled)

J9VM - 20090519_035743_BHdSMr

JIT  - r9_20090518_2017

GC   - 20090417_AA)

JCL  - 20090529_01

<#/ais02/job/temp/kfsx_java_ssh.ksh.110#> java -Xms1g -Xmx1g -classpath /opt/freeware/apache-tomcat-kfsprd/webapps/kfs-prd/WEB-INF/classes:/opt/freeware/apache-tomcat-kfsprd/common/lib/*:/opt/freeware/apache-tomcat-kfsprd/webapps/kfs-prd/WEB-INF/lib/*:. edu.csu.batch.service.RunBatch disbursementVoucherPreDisbursementProcessorExtractStep KFSXFPPD.disbursementVoucherPreDisbursementProcessorExtractStep.7529604.7529659.00

+ print *** \n*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT \n***

***

*** log4j:, WARNING:, ERROR:, Exception or Caused by MESSAGES TO STD OUTPUT

***

+ egrep ^log4j:|^WARNING:|^ERROR:|^Exception|^Caused by: /ais02/log/KFSXFPPD.KFSX_JAVA_01.7529604.7529659.00.2012_01_24_1403.log

log4j:WARN File option not set for appender [LogFile].

log4j:WARN Are you using FileAppender instead of ConsoleAppender?

log4j:WARN File option not set for appender [MemoryLogFile].

log4j:WARN Are you using FileAppender instead of ConsoleAppender?

log4j:ERROR No output stream or file set for the appender named [LogFile].

WARNING: Prefs file removed in background /home/appworx/.java/.userPrefs/prefs.xml

WARNING: Prefs file removed in background /etc/.java/.systemPrefs/prefs.xml

Exception in thread "Thread-3" java.lang.OutOfMemoryError

 

We need to increase the memory from 1 GB to 2 GB. Does anyone know how to do that?

John.

 

I changed the prompt value in the ABORTED job to 2g (prompt #5) and restarted the chain.

After the chain completed, I also updated prompt #5 at the chain level within KFSXFPPD so that future runs will use the value of 2g.

Dermot.

 

 

Aborted Module Name:  FAIDTKNT_EV.LYNX_01

  Date:        Day:      Time:          Resolution:

01/13/12     Fri         02:20           Restarted by Steve.

 

Error log and follow up comments:

01/13/12.    

 

Looking up wsprod.colostate.edu

Making HTTP connection to wsprod.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Retrying as HTTP0 request.

Looking up wsprod.colostate.edu

Making HTTP connection to wsprod.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Alert!: Unexpected network read error; connection aborted.

Can't Access `http://wsprod.colostate.edu/cwis231/autorun/parent_tknt_email.cfm?ay=FAIDTKNT_EV'

Alert!: Unable to access document.

 

lynx: Can't access startfile

 

I tried pinging wsprod.colostate.edu and got a response.  I then reset FAIDTKNT_EV.LYNX_01 and it finished. 

 FAIDTKNT_TRACK_NOTIFICATION is proceeding.

Steve.

 

    

 

 

 

Aborted Module Name:   FAIDCFIM_SP.SSH_SFTP_LIST_01

 

  Date:        Day:      Time:          Resolution:

01/19/12     Thu         06:59           See follow up below.

Error log and follow up comments:

 

 

# - sftp

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_infosys_prod"  cofcsu@ftp.college-assist.org

# > Welcome to COF

# > sftp> pwd

# > Remote working directory: /

# > sftp> lpwd

# > Local working directory: /ais101jfs/jobprd

# > sftp> ls -1 resp_query/FAIDCFEX_SP.2012_01_19_0600.gpg.resp.gpg

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/resp_query/FAIDCFEX_SP.2012_01_19_0600.gpg.resp.gpg" not found

# > (1)

#==============================================================================

# FATAL : Command failed with code : 1

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

# > (0)

# > (100)

#==============================================================================

# FATAL : Command failed with code : 100

 

It looks like the file (FAIDCFEX_SP.2012_01_19_0600.gpg) from COF this morning was empty. This caused our FAIDCFIM_SP failure. Can you check with them regarding this?

David.

 

 

 

 

 

 

Aborted Module Name:   HRMSREC_SAL.SQLLOAD-LOOP_01

  Date:        Day:      Time:          Resolution:

01/19/12     Thu         10:15           See follow up below.

 

Error log and follow up comments:

 

 

Record 25: Rejected - Error on table "CSUH"."CSUH_CAMPUS_REC_TRANS_00", column EE_CONTRIBUTION.

ORA-01722: invalid number

 

Record 26: Rejected - Error on table "CSUH"."CSUH_CAMPUS_REC_TRANS_00", column EE_CONTRIBUTION.

ORA-01722: invalid number

 

MAXIMUM ERROR COUNT EXCEEDED - Above statistics reflect partial run.

 

Table "CSUH"."CSUH_CAMPUS_REC_TRANS_00":

  3 Rows successfully loaded.

  100 Rows not loaded due to data errors.

  0 Rows not loaded because all WHEN clauses were failed.

  0 Rows not loaded because all fields were null.

Space allocated for bind array:                  99072 bytes(64 rows)

Read   buffer bytes: 1048576

 

 

Please restart this program.  The amounts in the data contained "$", which caused the load to fail.  I fixed the data, so we should be good to go.

Steve H.
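
The kind of cleanup described above could be sketched as a pre-load check on the amount column. This is an illustration only: the function name clean_amount is hypothetical, and removing thousands commas is an assumption that the notes above do not mention.

```python
from decimal import Decimal, InvalidOperation


def clean_amount(raw):
    """Strip '$', thousands commas and blanks, then parse as a number."""
    text = raw.strip().replace("$", "").replace(",", "")
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError("not a number: %r" % raw) from None
    if not value.is_finite():  # reject NaN / Infinity spellings
        raise ValueError("not a number: %r" % raw)
    return value


if __name__ == "__main__":
    print(clean_amount("$25.00"))
```

Running every EE_CONTRIBUTION value through a check like this before SQL*Loader starts would flag the offending records instead of hitting the maximum error count.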

 

 

 

Aborted Module Name:   EIDSUPDT.HRMSS111_01

  Date:        Day:      Time:          Resolution:

01/20/12     Fri         23:34           Restarted by Joleen.

 

Error log and follow up comments:

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

821140538 ORA-01422: exact fetch returns more than requested number of rows

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

The problem is that the CSUH_EMAIL_UPDATE function, when called with int_ref_id = 821140538, is returning multiple rows.

I just checked... In the function there is a select statement and sure enough it returns 2 rows (in HRPROD, not HRTEST).  The one record has a last name of "Calhoun", while the last name on the other one is "Calhoun delete".

-Bob-
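
The condition behind ORA-01422 (one id matching more than one row) could be checked ahead of the run with a duplicate scan. The sketch below works on rows already fetched into dictionaries; the row layout and the function name ids_with_multiple_rows are assumptions for illustration, not the actual CSUH_EMAIL_UPDATE query.

```python
from collections import Counter


def ids_with_multiple_rows(rows, key="int_ref_id"):
    """Return the ids that appear on more than one row, sorted."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Sample rows modelled on the case above; the second id is made up.
    sample = [
        {"int_ref_id": 821140538, "last_name": "Calhoun"},
        {"int_ref_id": 821140538, "last_name": "Calhoun delete"},
        {"int_ref_id": 100000001, "last_name": "Example"},
    ]
    print(ids_with_multiple_rows(sample))
```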

 

Carolee

Can you please look into this and see if you can resolve this data issue in HRPROD so that we can proceed with/finish the EIDS schedule?

Vicki.

 

I'm not sure what needs to be done immediately. I'm waiting for the "good" record to be approved so we can merge any eID and ARIES records. Then I'll delete the other record. Should I delete the email from the good record as a fix for today?

Carolee.

 

Carolee,

The only way to fix this issue now is to either delete the record, change the attribute12 field (CSU ID), or change the effective_end_date (currently '31-dec-4712').

-Bob-

 

I deleted the CSU ID.

Carolee.

 

 

 

Aborted Module Name:   HRMSDED_SAL.HRMSRPTS-LOOP_02

 

  Date:        Day:      Time:          Resolution:

01/24/12     Tue         19:25           See follow up below.

 

Error log and follow up comments:

 

p_salary_start='21-DEC-2011'

p_salary_end='24-JAN-2012'

element1='Campus Recreation'

element2='Campus Recreation Dues'

log_file='No'

------------

Execution options

VERSION=2.03b ORIENTATION=LANDSCAPE

Current NLS_LANG and NLS_NUMERIC_CHARACTERS Environment Variables are :

American_America.US7ASCII

 

Enter Password:

REP-0091: Invalid value for parameter 'ELEMENT1'.

 

Phase 4 of Salary processing, including the email to the listserv about Salary Payroll being done is waiting for HRMSDED_SAL to complete.

The newly added HRMSR317 report failed (see Robin's email below).  This was tested on AWTEST, but I notice that the HRMSR317 job definition on AWPROD has different prompt default values than on AWTEST, which I'm suspecting may be causing the failure?

On AWPROD, the following values were passed to HRMSR317:

element1='Campus Recreation'

element2='Campus Recreation Dues'

log_file='No'

but on AWTEST, the values used were:

element1='5207'                                                                 

element2='5255'                                                                

log_file='N'                                                                   

 

What values should be used on AWPROD?

Janice.
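
The comparison of prompt defaults between AWTEST and AWPROD could be sketched as a small dictionary diff. The values below are the ones quoted above; the function name diff_prompts is hypothetical, and how the defaults would be exported from each environment is left open.

```python
def diff_prompts(test_values, prod_values):
    """Return {prompt: (test_value, prod_value)} for every prompt that differs."""
    names = sorted(set(test_values) | set(prod_values))
    return {
        name: (test_values.get(name), prod_values.get(name))
        for name in names
        if test_values.get(name) != prod_values.get(name)
    }


if __name__ == "__main__":
    awtest = {"element1": "5207", "element2": "5255", "log_file": "N"}
    awprod = {"element1": "Campus Recreation",
              "element2": "Campus Recreation Dues",
              "log_file": "No"}
    for name, (test_value, prod_value) in diff_prompts(awtest, awprod).items():
        print(name, "AWTEST=%r" % test_value, "AWPROD=%r" % prod_value)
```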

 

You are correct about the parameters.

Bev.

 

Robin,

Please make the parameter changes to the HRMSR317 job (module) definition on AWPROD and then reset the HRMSDED_SAL.HRMSRPTS-LOOP_02 component.  It keeps track of which reports it has completed, so should pick up with HRMSR317, which by the way is the last report that needs to run.

Oh, and this failure is also holding up the capture of the following Salary reports to VistaPlus:

HRMSR002 HRMSR003 HRMSR005 HRMSR040 HRMSR041 HRMSR042

HRMSR043,CSU_FAC_FLEX_COMBINED,

HRMSR043,CSU_ST_FLEX_COMBINED,

HRMSR240, HRMSR315, HRMSR316,

So all of these, along with the HRMSR317, will be captured as soon as we have a successful completion on HRMSR317.

Janice.

 

 

 

Aborted Module Name:   FAIDALEX_EV.SSH_SFTP_01

 

  Date:        Day:      Time:          Resolution:

01/24/12     Tue         18:02           See follow up below.

 

 

Error log and follow up comments:

 

 

 

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > Permission denied (password,gssapi-with-mic).

***

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

Joleen.

 

 

I just tried manually logging into the server and it worked this time.

/usr/bin/sftp -oIdentityFile="/home/jobprd/.ssh/csu_to_elmnet"  SCH05FO@ftp.elmproduction.com

Connecting to ftp.elmproduction.com...

sftp> exit

You should be able to follow the appropriate restart instructions with any required approval.

Elden.

 

 

 

Aborted Module Name:   KFSXFPPD.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

02/06/13     Wed         14:00           Decrease the employee Customer Name field.

 

Error log and follow up comments:

 

2013-02-06 17:26:37,358 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 92, maximum: 90)

; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 92, maximum: 90)

 

The DV_CNTCT_PRSN_NM was “Stipend for participation in Upward Bound program for Block 1 Sep12-Dec12 970-491-3551”. Concatenated with the “Info: ” prefix, this created a string that was longer than 90 characters and caused the error.  This particular error was not caused by a special character, as seen in the entries above.

 

The resolution was to shorten the person nm by 2 characters.  In this case the area code for the phone number was removed.

 

Documentation will be created to check for the scenarios listed below. All of these have the potential to push the note over 90 characters:

pnt.setCustomerNoteText("Info: " + document.getDisbVchrContactPersonName() + " " + document.getDisbVchrContactPhoneNumber());

 

pnt.setCustomerNoteText("Send Check To: " + dvSpecialHandlingPersonName);

pnt.setCustomerNoteText(dvSpecialHandlingLine1Address);

pnt.setCustomerNoteText(dvSpecialHandlingLine2Address);

pnt.setCustomerNoteText(dvSpecialHandlingCity + ", " + dvSpecialHandlingState + " " + dvSpecialHandlingZip);

pnt.setCustomerNoteText("Attachment Included");

pnt.setCustomerNoteText("Reimbursement associated with " + dvnet.getDisbVchrServicePerformedDesc());

pnt.setCustomerNoteText("The total per diem amount for your daily expenses is " + dvnet.getDisbVchrPerdiemCalculatedAmt());

pnt.setCustomerNoteText("The total dollar amount for your vehicle mileage is " + dvnet.getDisbVchrPersonalCarAmount());

pnt.setCustomerNoteText(exp.getDisbVchrExpenseCompanyName() + " " + exp.getDisbVchrExpenseAmount());

pnt.setCustomerNoteText("Payment is for the following individuals/charges:");

pnt.setCustomerNoteText(dvpcr.getDvConferenceRegistrantName() + " " + dvpcr.getDisbVchrExpenseAmount());
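
The 90-character limit can be checked before the batch runs. The sketch below is a standalone illustration in Python rather than the KFS Java code; MAX_NOTE_LEN comes from the ORA-12899 message above, the function names are hypothetical, and the sample note assumes the text was the “Info: ” prefix followed by the contact name quoted above.

```python
MAX_NOTE_LEN = 90  # maximum reported by ORA-12899 for CUST_NTE_TXT


def exceeds_limit(note, limit=MAX_NOTE_LEN):
    """True when the note would not fit in the column."""
    return len(note) > limit


def overlong_notes(notes, limit=MAX_NOTE_LEN):
    """Return (length, note) pairs for every note that is too long."""
    return [(len(note), note) for note in notes if exceeds_limit(note, limit)]


if __name__ == "__main__":
    contact = ("Stipend for participation in Upward Bound program "
               "for Block 1 Sep12-Dec12 970-491-3551")
    for length, note in overlong_notes(["Info: " + contact,
                                        "Attachment Included"]):
        print(length, note)
```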

 

Execute the below script to search for bad characters and long customer notes for the names and phone numbers.

G:\DOC\KFS\Production_Fixes\Production Recovery\KFSXFPPD_find.sql

Josh.

 

 

Aborted Module Name:   KFSXTXW2.SEND_MAIL_01

  Date:        Day:      Time:          Resolution:

02/01/12     Wed       10:03           See follow up below.

 

Error log and follow up comments:

 

KFSXTXW2.SEND_MAIL_01 /  KFSXTXW2_TAX_W2_AND_1099_RPT ABORTED.

# FATAL : Error opening file (/ais01/bkp/KFSXTXW2.HRMSS244_01.RPT) : A file or directory in the path name does not exist.

 

The HRMSS244 step was skipped because the java program did **NOT** create a 1099 file, which must be used as the input to HRMSS244.

I guess we’ve never had this situation occur before, nor, apparently, did we expect it, because there is no logic to stop the SEND_MAIL component from running in this situation.  Ironically, we do have logic in place to skip the CHAIN_SQL_INIT and HRMSS244 components if there is no 1099 file; we just didn’t have that logic on SEND_MAIL.  SEND_MAIL is failing because it doesn’t find the “report” that would have been created by HRMSS244, which in this case didn’t even run!

The solution will require determining why the java program, electronicFilingStep, created no output file, and then rerunning KFSXTXW2 from the beginning.

If it helps, there was an exceptions file created from electronicFilingStep – attached to this email and also available in Kebler file: /ais01/bkp/KFSXTXW2.7571154.1099_exc.csv

Janice.

 

Yes, this csv file contains a critical error message indicating that a business validation failed which has prevented the file from being created. This is a vital piece of information.

I’d like to challenge Gudrun/Dermot/Steve to remember this for next year.

I am working with BFS to fix the problem. It looks like it will require a Java coding change that cannot take place until next Wednesday. Can someone please cancel/delete this job flow/chain until we are ready?

John.

 

I’d like to challenge Dermot/Steve to be sure this is documented in KFSXTXW2 Process Flow notes.  Even better…  maybe we should create a follow-up Incident, whereby the process flow is modified to send out an alternate email, to which the exceptions file would be attached and indicate that the 1099 file did not get created.  This would actually be easy to do if we just added some BEFORE conditions to SEND_MAIL component to set value in a new chain specific subvar, #which_rpt_{chain_id} as follows:

When  {#kfsx_1099_{chain_id}} = Y, then set  #which_rpt_{chain_id}={#bkp}/{#1}.HRMSS244_01.RPT

When {#kfsx_1099_{chain_id}} != Y, then set  #which_rpt_{chain_id}={#mailst}/SEND_MAIL.KFSXTXW2.NO_1099.TXT

Then use this new subvar {#which_rpt_{chain_id}} as the value for the SEND_MAIL prompt #12, rather than the current value of {#bkp}/{#1}.HRMSS244_01.RPT

Create /ais01/dat/misc/mailst/SEND_MAIL.KFSXTXW2.NO_1099.TXT  file to contain something like this:

***ERROR*** electronicFilingStep did NOT create a 1099 file.

See attached exceptions file for possible reasons that 1099 file creation did not occur.

The solution described above would be cleaner: the SEND_MAIL component would not fail due to a missing {#bkp}/{#1}.HRMSS244_01.RPT file **and**, hopefully, troubleshooting would start in the right direction via examination of the exceptions file that will be attached to the email.
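
The two BEFORE conditions proposed above amount to a simple selection. The sketch below only restates that logic outside Appman; the function name which_report is hypothetical, and the example assumes {#1} resolves to KFSXTXW2 and {#bkp} to /ais01/bkp, as the failing path in the error message suggests.

```python
def which_report(kfsx_1099_flag, bkp_dir, chain_name, mailst_dir):
    """Pick the file SEND_MAIL prompt #12 should point at."""
    if kfsx_1099_flag == "Y":
        # A 1099 file exists, so HRMSS244 ran and produced its report.
        return "%s/%s.HRMSS244_01.RPT" % (bkp_dir, chain_name)
    # No 1099 file: send the canned notice instead of a missing report.
    return "%s/SEND_MAIL.KFSXTXW2.NO_1099.TXT" % mailst_dir


if __name__ == "__main__":
    for flag in ("Y", "N"):
        print(flag, which_report(flag, "/ais01/bkp", "KFSXTXW2",
                                 "/ais01/dat/misc/mailst"))
```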

Anyone care to pursue this alternative?

 If we hurry to implement above, we’d have the perfect testing opportunity in production right now.

Janice.

 

Completed Clarity Task (T08160).

Dermot.

 

 

 

Aborted Module Name:  KFSXCS53.SSH_EXEC_01

  Date:        Day:      Time:          Resolution:

02/04/12     Sat         00:12           See note from Janice below.

 

Error log and follow up comments:

 

02/04/2012 11:33    JMWILKIN

Just taking a quick look at Appman - making sure some recent Faid process flow renames working okay. 

Faid schedule had already completed, but I did notice the KFSXCS53.SSH_EXEC_01 failure.  KFSXCS53_CSU_LOAD_ID_ATTACH process flow runs weekly on Friday. 

Error indicates that appworx user is not authorized to execute SSH_EXEC script.

Sure enough, between the previous Friday's run and yesterday's run, the following entry had been removed from the authoriz.list file:

appworx /appworx/csu/exec/SSH_EXEC.PL 1

I put this entry back into the authoriz.lis file and restarted the component, which has now successfully completed.  KFSXAM99 "schedule done" should now be able to proceed once the remaining components of KFSXCS53 have completed.
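
A removed entry like this one could be detected before the weekly run by comparing the authorization list against the entries the process flows depend on. This is a sketch under assumptions: the function name missing_entries is hypothetical, the file is treated as plain whitespace-separated lines, and the list of required entries would have to be maintained separately.

```python
def _normalize(line):
    """Collapse runs of whitespace so spacing differences do not matter."""
    return " ".join(line.split())


def missing_entries(list_text, required):
    """Return the required entries that are absent from the list text."""
    present = {_normalize(line) for line in list_text.splitlines() if line.strip()}
    return [entry for entry in required if _normalize(entry) not in present]


if __name__ == "__main__":
    required = ["appworx /appworx/csu/exec/SSH_EXEC.PL 1"]
    # The first line is a made-up entry standing in for the rest of the file.
    current = "appworx /appworx/csu/exec/OTHER_SCRIPT.PL 1\n"
    print(missing_entries(current, required))
```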

 

 

 

 

Aborted Module Name:   AREGORGN.AREGS411_01

  Date:        Day:      Time:          Resolution:

03/04/13     Mon        17:01           Restarted by Joleen.

09/25/13     Wed        09:38           Restarted by Steve.

 

Error log and follow up comments:

 

17:01:15 295         v_api_count := v_api_del_count + v_api_add_count;

17:01:15 296 

17:01:15 297         if v_api_count > 200 then

17:01:15 298             raise_application_error(-20500,'Error count Exceeded 200');

17:01:15 299         end if;

17:01:15 300 

 

Problem Inserting the Alt Pin record for:

A,201390,821045156,ADVR,841804

error is: ORA-20100: ::Cannot create, record already exists::

 

This error appears 201 times in the utl file for 201 different records

 

Sue fixed the input file. I restarted AREGORGN.AREGS411. It has finished running.

Joleen.

 

09/25/13.

09:38:12 294 

09:38:12 295         v_api_count := v_api_del_count + v_api_add_count;

09:38:12 296 

09:38:12 297         if v_api_count > 200 then

09:38:12 298             raise_application_error(-20500,'Error count Exceeded 200');

09:38:12 299         end if;

                                        

I’ve ftp’d a new file.  Hope this one works.

Can you try the AREGORGN/AREGS411 again.

Denise.

 

 

 

 

Aborted Module Name:  KFSXFPPC.KFSX_JAVA_04  

  Date:        Day:      Time:          Resolution:

02/29/12     Wed       19:16           Restarted by Dermot.

07/18/12     Wed       19:18           Restarted by Dermot.

 

Error log and follow up comments:

 

 

02/29/12.

at org.springmodules.orm.ojb.PersistenceBrokerTemplate.execute(PersistenceBrokerTemplate.java:141)

                at org.springmodules.orm.ojb.PersistenceBrokerTemplate.store(PersistenceBrokerTemplate.java)

                ... 131 more

2012-02-29 19:21:34,975 [main] INFO  edu.csu.batch.service.RunBatch :: Finished executing job: KFSXFPPC.procurementCardRouteDocumentsStep.7723638.7723649.00 steps: [procurementCardRouteDocumentsStep]

2012-02-29 19:21:34,975 [main] INFO  edu.csu.batch.service.RunBatch :: RunBatch ERROR: Exception (nested) org.kuali.rice.kew.exception.WorkflowRuntimeException: java.lang.RuntimeException: post processor caught exception while handling route status change: OJB operation; uncategorized SQLException for SQL []; SQL state [61000]; error code [60]; ORA-00060: deadlock detected while waiting for resource

; nested exception is java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource

RunBatch ERROR: Exception found:

 

This step ABORTED due to both KFSXFPPC and KFSXFPAA trying to access the same table at precisely the same time.

Dermot.

 

Please restart this module.

Josh.

 

07/18/12.

2012-07-18 19:18:48,031 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springframework.dao.DataAccessResourceFailureException: Could not open OJB PersistenceBroker; nested exception is org.apache.ojb.broker.PBFactoryException: Transaction synchronization failed - wrong status of external JTA tx. Expected was an 'active' or 'no transaction', found status is 'STATUS_MARKED_ROLLBACK'

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.44#> [[ 1 > 0 ]]

<#errtrap_ssh.44#> exit 1

 

As per instructions from John, the job was restarted and completed successfully.

Dermot.

 

 

 

Aborted Module Name:   KFSXCS14.KFSXS011_01

  Date:        Day:      Time:          Resolution:

03/01/12     Thu        07:45            Restarted by Dermot.

 

 

Error log and follow up comments:

 

ORA-20001: Error in KFSX011.sql: -20001 -ERROR- ORA-20001: Error in

KFSX011.sql: -60 -ERROR- ORA-00060: deadlock detected while waiting for

resource

ORA-06512: at line 537

 

07:45:25 519       IF fringe_error_count > 1 THEN

07:45:25 520           DBMS_OUTPUT.PUT_LINE ('-');

07:45:25 521           DBMS_OUTPUT.PUT_LINE ('WARNING '|| to_char(fringe_error_count) || ' Fringe Distribution errors found');

07:45:25 522           DBMS_OUTPUT.PUT_LINE ('-');

07:45:25 523       END IF;

07:45:25 524       DBMS_OUTPUT.PUT_LINE ('Total GL count: '                           || to_char(tot_gl_count) );

07:45:25 525       DBMS_OUTPUT.PUT_LINE ('Total Fringe count: '                                    || to_char(tot_fringe_count) );

07:45:25 526       DBMS_OUTPUT.PUT_LINE ('Total Cap Const CSU cash split: '     || to_char(csu_cash_count) );

07:45:25 527       DBMS_OUTPUT.PUT_LINE ('Total Cap Const State cash split: '   || to_char(state_cash_count) );

07:45:25 528       DBMS_OUTPUT.PUT_LINE ('Total Cap Const Fed cash split: '     || to_char(fed_cash_count) );

07:45:25 529       DBMS_OUTPUT.PUT_LINE ('Total Staging Records Created count: '|| to_char(stage_count) );

07:45:25 530       DBMS_OUTPUT.PUT_LINE

07:45:25 531             ('**** End   of KFSXS011 ' ||to_char(sysdate,'MM/DD/YYYY HH24:MI:SS'));

07:45:25 532 

07:45:25 533  EXCEPTION

07:45:25 534  WHEN OTHERS THEN

07:45:25 535        DBMS_OUTPUT.PUT_LINE

07:45:25 536             ('**** Exception Encountered on gl_entry_t: '||current_gl_rec_string|| ' ERROR: '||SQLERRM);

07:45:25 537        raise_application_error(-20001,'Error in KFSX011.sql: '||current_gl_rec_string|| SQLCODE ||

07:45:25 538           ' -ERROR- '||SQLERRM);

07:45:25 539 

07:45:25 540  END;

 

Please restart the aborted job.

Josh.

 

A conflict between 2 scripts accessing the same table at the same time caused a “deadlock detected while waiting for resource” error, which caused KFSXCS14.KFSXS011_01 to ABORT. The chain in contention with this script was KFSXCS12.KFSXS011_01.

The chain was restarted and completed successfully. If this ABORT happens again, we may need to add a dependency between these chains to prevent a recurrence.

Dermot.
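
The manual restart used here could, in principle, be automated as a bounded retry that reacts only to ORA-00060. The sketch below is a generic illustration, not an Appman feature; the function name, the attempt count and the wait time are all assumptions, and a chain dependency as suggested above would remain the cleaner fix.

```python
import time


def run_with_deadlock_retry(job, attempts=3, wait_seconds=60, sleep=time.sleep):
    """Call job(); if it raises an error mentioning ORA-00060, wait and retry.

    Any other error, or a deadlock on the last attempt, is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return job()
        except Exception as exc:
            is_deadlock = "ORA-00060" in str(exc)
            if not is_deadlock or attempt == attempts:
                raise
            sleep(wait_seconds)  # give the competing chain time to finish
```

The sleep parameter exists so that the wait can be replaced in tests; job stands for whatever callable runs the step and raises on failure.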

 

 

 

 

Aborted Module Name:   KFSXBMAI.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

03/01/12     Thu        19:22           Files resubmitted from RMS System.

 

 

Error log and follow up comments:

 

org.kuali.kfs.sys.exception.ParseException: error Parsing error was encountered on line 15, column 47: cvc-datatype-valid.1.2.1: '53    0' is not a valid value for 'integer'.

 

Hi Ron/Tyler,

Last night the BMP account field failed.  It appears that there was a bad account in one of the xml files.

I have attached the xml of the bad account.   I changed the file extension so it wouldn’t get picked off by the email filter.

Account number “53    0” was not a valid integer.

This entry was found in file:

KFSXBMAI.UresspC2012-03-01.xml

Do you want to resubmit the file tonight?

Let me know how you would like to proceed or if I can provide additional information.

The diagnostic line number is 15.

Josh.

 

We have backup of the files and will resubmit.

Just to be clear, we’ll correct and resubmit from our backup files, so there is no need for you to retrieve the xml.

Ron.

 

We do not restart. We let it auto cancel, and Ron/Tyler will resubmit the files from their RMS system. So we merely report the problem to them.

John.

 

How to locate error in file.

Check Appman Chain “KFSXBMAI /COLLECT_FILES” which is the step before the ABORTED Java step.

Prompt 1 will lead you to the subvar {#kfsxprod}/{module}.DRIVER.DAT. Now click on the subvar icon “#” and type kfsxprod; this will show the Kebler location: /ais01/dat/kfsx/prod

Then on Kebler,

cd /ais01/dat/kfsx/prod

ls KFSXBM*

KFSXBMAI.COLLECT_FILES_01.DRIVER.DAT

cd /userfiles/Uressp/data

$ ls KFSXB*

ls: 0653-341 The file KFSXB* does not exist.

If the file does not exist in this directory then go to the bkp directory.

$ cd /ais01/bkp

KFSXBMAI.UresspS2012-03-01.xml.2012_03_01_1922.bak

KFSXBMAI.UresspS2012-03-01.xml.TEMP.2012_03_01_1921.bak

 

   <chartOfAccountsCode>CO</chartOfAccountsCode>

        <accountNumber>53    0</accountNumber>

        <campusCode>MC</campusCode>

        <accountName>Dummy Account - do not use </accountName>
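
Once the backup file has been located, the bad value can be found with a quick scan rather than by eye. The sketch below is an assumption-based helper (the function name bad_account_numbers is hypothetical); it uses a simple line-by-line pattern match instead of schema validation, so it only catches accountNumber values that are not plain integers.

```python
import re

ACCOUNT_RE = re.compile(r"<accountNumber>(.*?)</accountNumber>")
INTEGER_RE = re.compile(r"[+-]?\d+")


def bad_account_numbers(xml_text):
    """Return (line_number, value) for each accountNumber that is not an integer."""
    bad = []
    for line_no, line in enumerate(xml_text.splitlines(), start=1):
        for value in ACCOUNT_RE.findall(line):
            if not INTEGER_RE.fullmatch(value.strip()):
                bad.append((line_no, value))
    return bad


if __name__ == "__main__":
    sample = (
        "<chartOfAccountsCode>CO</chartOfAccountsCode>\n"
        "<accountNumber>53    0</accountNumber>\n"
        "<campusCode>MC</campusCode>\n"
    )
    print(bad_account_numbers(sample))
```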

 

 

Aborted Module Name:  AREGTTRN_TOD_TRANSCRIPT  

  Date:        Day:      Time:          Resolution:

03/09/12     Fri          03:22           See follow up below.

 

Error log and follow up comments:

 

 

Here's the SFTP command that failed and the reason "Hostname and service name not provided or found" which likely indicates a problem with DNS lookup or a general network problem somewhere between CSU and iwantmytranscript.com.

 

...

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109"  colora-88@iwantmytranscript.com

4 01:20:35-Parent: (1)Checking child process(2822572)

...

4 01:22:45-No Kill File found('/appworx/run/kill.7746386.00').

4 01:22:45-Parent: sleeping for 10 seconds.

# > ssh: iwantmytranscript.com: Hostname and service name not provided or found^M

# > Connection closed^M

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

...

Elden.

 

All,

It's important that some analysis occur prior to restarting any aborted components within the various Transcript process flows.  We may wish to consider documenting, within the Abort Documentation, some of the errors which have been encountered (such as described in Elden's email above) to serve as a guide for handling similar aborts in the future.  I believe the consensus is that such failures to connect, as shown above, would in general be "safe" to restart.  Obviously, failures which occur during an SSH transfer (when a connection has been established) would require additional investigation to determine if any transfers/partial transfers had occurred and/or if any cleanup activity would need to occur prior to restarting an aborted component.

If Abort documentation is to be maintained via the AREGTTRN_TOD_TRANSCRIPT sub-process flow, then it probably would be a good idea to have a reminder within Abort Documentation for the main process flows, AREGDYTS_TRANSCRIPTS and AREGFQTR_SEND_TRANSCRIPTS, to refer to the AREGTTRN_TOD_TRANSCRIPT sub-process flow Abort Documentation.

Janice.
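
The rule of thumb above (a failure to connect is generally safe to restart, a failure after a session was established needs investigation) could be sketched as a log check. This is only a rough illustration: the function name restart_advice is hypothetical, the marker list contains just the one message seen in this abort, and treating any "sftp>" prompt as proof of an established session is an assumption.

```python
CONNECT_FAILURE_MARKERS = (
    "Hostname and service name not provided or found",
)


def restart_advice(log_text, markers=CONNECT_FAILURE_MARKERS):
    """Return 'restart' for a pure connection failure, else 'investigate'."""
    session_started = "sftp>" in log_text  # a prompt means a session was opened
    connect_failed = any(marker in log_text for marker in markers)
    if connect_failed and not session_started:
        return "restart"
    return "investigate"


if __name__ == "__main__":
    log = ("# > ssh: iwantmytranscript.com: Hostname and service name "
           "not provided or found\n# > Connection closed\n# > (255)\n")
    print(restart_advice(log))
```

Anything the check does not recognize falls through to "investigate", which keeps the default on the cautious side.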

 

 

 

 

Aborted Module Name:   Transcript Connection Failure

  Date:        Day:      Time:          Resolution:

03/05/12     Mon        10:03          See follow up below.

 

Error log and follow up comments:

 

Here's the SFTP command that failed and the reason "Hostname and service name not provided or found" which likely indicates a problem with DNS lookup or a general network problem somewhere between CSU and iwantmytranscript.com.

 

...

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109"  colora-88@iwantmytranscript.com

4 01:20:35-Parent: (1)Checking child process(2822572)

...

4 01:22:45-No Kill File found('/appworx/run/kill.7746386.00').

4 01:22:45-Parent: sleeping for 10 seconds.

# > ssh: iwantmytranscript.com: Hostname and service name not provided or found^M

# > Connection closed^M

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

Elden.

 

All,

It's important that some analysis occur prior to restarting any aborted components within the various Transcript process flows.  We may wish to consider documenting, within the Abort Documentation, some of the errors which have been encountered (such as described in Elden's email below) to serve as a guide for handling similar aborts in the future.  I believe the consensus is that such failures to connect, as shown below, would in general be "safe" to restart.  Obviously, failures which occur during an SSH transfer (when a connection has been established) would require additional investigation to determine if any transfers/partial transfers had occurred and/or if any cleanup activity would need to occur prior to restarting an aborted component.

If Abort documentation is to be maintained via the AREGTTRN_TOD_TRANSCRIPT sub-process flow, then it probably would be a good idea to have a reminder within Abort Documentation for the main process flows, AREGDYTS_TRANSCRIPTS and AREGFQTR_SEND_TRANSCRIPTS, to refer to the AREGTTRN_TOD_TRANSCRIPT sub-process flow Abort Documentation.

Janice.
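
As a starting point for the guide suggested above, here is a minimal sketch of how the connection-phase messages seen in this documentation could be sorted from failures that need more analysis. The function name and the three labels are hypothetical; the message strings are copied from SFTP log excerpts recorded in this documentation. Anything that does not match should still be investigated by hand before a restart.

```python
import re

# Messages that show up before a session is established (from logs in
# this documentation). The grouping and labels are illustrative only.
CONNECT_PHASE_PATTERNS = [
    r"Hostname and service name not provided or found",
    r"Permission denied \(",
]

# Messages that mean the remote host key must be verified before retrying.
HOST_KEY_PATTERNS = [
    r"REMOTE HOST IDENTIFICATION HAS CHANGED",
    r"Host key verification failed",
]


def classify_sftp_abort(log_text):
    """Return a rough triage label for an aborted SFTP step."""
    for pattern in HOST_KEY_PATTERNS:
        if re.search(pattern, log_text):
            return "verify-host-key"
    for pattern in CONNECT_PHASE_PATTERNS:
        if re.search(pattern, log_text):
            return "connect-failure"
    # Unknown message: a transfer may have started, so analyze first.
    return "investigate"
```

A "connect-failure" result corresponds to the cases described above as generally safe to restart, subject to the process flow and job notes.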

 

 

Aborted Module Name:   KFSXCS31.KFSX_JAVA_01

 

  Date:        Day:      Time:          Resolution:

03/21/12     Wed        07:32          Aborted step deleted to allow process flow to continue.

 

Error log and follow up comments:

 

 

2012-03-21 07:53:48,892 [main] INFO  org.kuali.rice.kew.docsearch.SearchableAttribute :: Indexing document 1764269 for document search...

2012-03-21 07:53:49,325 [main] INFO  org.kuali.rice.kns.util.MaintenanceUtils :: starting checkForLockingDocument (by MaintenanceDocument)

2012-03-21 07:53:49,330 [main] ERROR org.kuali.rice.kew.docsearch.SearchableAttribute

:: Encountered an error when attempting to index searchable attributes, requeuing.

java.lang.NumberFormatException: can't convert infinity or NaN

 

at java.math.BigDecimal.<init>(BigDecimal.java:574)

              at java.math.BigDecimal.<init>(BigDecimal.java:541)

              at org.kuali.rice.kns.util.AbstractKualiDecimal.<init>(AbstractKualiDecimal.java:61)

              at org.kuali.rice.kns.util.KualiDecimal.<init>(KualiDecimal.java:60)

              at org.kuali.kfs.module.cam.util.KualiDecimalUtils.safeMultiply(KualiDecimalUtils.java:148)

              at org.kuali.kfs.module.cam.document.service.impl.AssetGlobalServiceImpl.setupAsset(AssetGlobalServiceImpl.java:455)

              at org.kuali.kfs.module.cam.document.service.impl.AssetGlobalServiceImpl.getSeparateAssets(AssetGlobalServiceImpl.java:408)

              at org.kuali.kfs.module.cam.businessobject.AssetGlobal.generateGlobalChangesToPersist(AssetGlobal.java:676)

              at org.kuali.rice.kns.workflow.attribute.DataDictionarySearchableAttribute.findAllSearchableAttributesForGlobalBusinessObject(DataDictionarySearchableAttribute.java:394)

 

 

Document 1764269 is an Asset Global document.

I sent an email to Theresa/Debra who work with Assets.

The Abort log shows these lines:

 

at org.kuali.kfs.module.cam.util.KualiDecimalUtils.safeMultiply(KualiDecimalUtils.java:148)

at org.kuali.kfs.module.cam.document.service.impl.AssetGlobalServiceImpl.setupAsset(AssetGlobalServiceImpl.java:455)

 

When I checked the Java program AssetGlobalServiceImpl at line 455, I could see that it is trying to set the Salvage amount to a decimal. I am therefore requesting that Theresa/Debra clean up the decimal field on the Asset Global. Hopefully this will be the solution; otherwise we will get the abort again tomorrow.

John.
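
For reference, the NumberFormatException in the trace is what Java's BigDecimal constructor raises when it is handed a double that is NaN or infinite. The sketch below is a Python illustration of the general guard (check that a computed value is finite before converting it to a decimal). It is not the Kuali code, and the function name is hypothetical.

```python
import math
from decimal import Decimal


def to_decimal_or_none(value):
    """Convert a float to Decimal, or return None for NaN or infinity.

    Hypothetical helper: the caller decides how to report a None result
    (for example, by flagging the document for data cleanup).
    """
    if not math.isfinite(value):
        return None
    # Going through str() avoids carrying binary float noise into the Decimal.
    return Decimal(str(value))
```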

 

 

Aborted Module Name:  CLMSACCT.CLMS_URL_EXEC_01

 

  Date:        Day:      Time:          Resolution:

03/23/12      Fri         17:31          See follow up below.

 

Error log and follow up comments:

 

 

StackTrace:

   at System.Net.HttpWebRequest.GetResponse()

   at ClmHttp.Post(String url, String post_data) in d:\Program Files\SCT\clm\Application\App_Code\ClmHttp.cs:line 41

   at ASP.interfaces_outbound_accounting_feed_extract_aspx.page_load() in d:\Program Files\SCT\clm\Application\interfaces\outbound_accounting_feed_extract.aspx:line 10

Message:

The remote server returned an error: (500) Internal Server Error.

==================================================================

=== lynx.stderr ====================================================

 

=== lynx.status ====================================================

   URL=http://clm.colostate.edu/clm/interfaces/outbound_accounting_feed_extract.aspx (GET)

STATUS=HTTP/1.1 200 OK

==================================================================

[100] : *** ERROR : Status (error) Returned ***

rm: Removing lynx.581856/lynx.status

rm: Removing lynx.581856/lynx.stderr

rm: Removing lynx.581856/lynx.stdout

rm: Removing directory lynx.581856

#== Exiting /appworx/csu/exec/CLMS_URL_EXEC.SH [prod acct_feed_extract ] (100) ============

+ err=100

 

I think there may have been some connectivity issues.  I connected to CLM manually this morning ok so hopefully we’re good now.  Can you restart the job?

Steven Dove.

 

I restarted the component and it aborted again. It looks like the same error as before. :(

Joleen.

 

I think this job is failing because it isn’t finding any records to process.  I checked with Trish and she said it would be ok if we skip this job so our production schedule can continue.

Steven Dove.

 

 

 

 

 

Aborted Module Name:   HRMSDED_HRL.HRMSS061_11

  Date:        Day:      Time:          Resolution:

04/02/12      Mon      16:00          Deleted by Joleen.

Error log and follow up comments:

 

 

 

+ print *** \n*** HRMSR316 EXTRACT UNSUCCESSFUL - ABORT \n***

***

*** HRMSR316 EXTRACT UNSUCCESSFUL - ABORT

***

+ exit 100

The extract was probably unsuccessful because the report (HRMSR316) did not return any data.

Steve H.

 

 

We deleted HRMSDED_HRL.HRMSS061_11.

Joleen.

 

 

 

 

Aborted Module Name:   HRMSKFS_QPH.HRMSS175_01

  Date:        Day:      Time:          Resolution:

04/02/12      Mon       23:19          See note from Elden below.

 

Error log and follow up comments:

 

 

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_deltadentalco-4096-20100914"  CSU1@transfer.deltadentalco.com

# > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

# > @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @

# > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

# > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

# > Someone could be eavesdropping on you right now (man-in-the-middle attack)!

# > It is also possible that the RSA host key has just been changed.

# > The fingerprint for the RSA key sent by the remote host is

# > 55:03:77:0a:88:d1:92:87:7f:bb:d9:bb:1f:ca:eb:87.

# > Please contact your system administrator.

# > Add correct host key in /home/jobprd/.ssh/known_hosts to get rid of this message.

# > Offending key in /home/jobprd/.ssh/known_hosts:30

# > RSA host key for transfer.deltadentalco.com has changed and you have requested strict checking.

# > Host key verification failed.

# > Connection closed

# > (255)

 

 

I’m glad you deleted the failed SSH.  It looks like they changed their server or server key.  The error you’re seeing is a security safety mechanism to reduce the risk of a man-in-the-middle attack.  We should check with our HR team to verify if Delta Dental sent out a notice or if they did indeed change their server.

Once we are satisfied that this is a legitimate server for Delta Dental, then log into jobprd:

•     comment out (prefix the line with ‘## ‘) the current entry for Delta Dental from jobprd’s ~/.ssh/known_hosts file

o     change ‘transfer.deltadentalco.com,205.169.191.2…’ to ‘## transfer.deltadentalco.com,205.169.191.2’

•     manually connect to Delta Dental:

o     /usr/bin/sftp  -oIdentityFile="/home/jobprd/.ssh/csu_to_deltadentalco-4096-20100914"  CSU1@transfer.deltadentalco.com

•     accept the new server key from Delta Dental

Elden.
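
The "comment out the current entry" step above can be sketched as follows. This is only an illustration: the function name is hypothetical, it works on a list of lines rather than editing the file in place, and it assumes plain (unhashed) known_hosts entries whose first field is a comma-separated list of host names and addresses. Run it only after the new server has been confirmed as legitimate.

```python
def comment_out_host(lines, hostname, prefix="## "):
    """Return a copy of known_hosts lines with entries for hostname commented out."""
    result = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and lines that are already comments.
        if stripped and not stripped.startswith("#"):
            hosts = stripped.split()[0].split(",")
            if hostname in hosts:
                result.append(prefix + line)
                continue
        result.append(line)
    return result
```

Reading ~/.ssh/known_hosts into a list, passing it through this function and writing the result back (after keeping a backup copy) leaves the old key in the file as a comment, so it can be restored if needed.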

 

 

 

Aborted Module Name:   HRMSACH_QPS.HRMSR218_01

  Date:        Day:      Time:          Resolution:

04/03/12      Tue       08:18          See note from David below.

 

Error log and follow up comments:

 

 

HRMSACH_QPS.HRMSR218_01 has a DB ERROR

 

2012-04-03 08:13:49 Prompt 21 changed from "{#{#1}_{#2}_CONSUB_REQ_NO}" to "{#HRMSACH_QPS_CONSUB_REQ_NO}" by OSU=appworx JDBC Thin Client

2012-04-03 08:18:30 java.sql.SQLException: ORA-20025: No role access to "Agent as "AWPROD" rtype=N edit=N my_dba=N my useq=115*-5-6-9-11"

ORA-06512: at "APPWORX.AWAPI2", line 4436

ORA-06512: at "APPWORX.AWOP_API", line 675

ORA-06512: at "APPWORX.AW5", line 2467

ORA-06512: at line 1

aw5.aw_condition_action

               0 jobid: IN:NUMERIC:java.math.BigDecimal:7927155

               1 condition_order: IN:NUMERIC:java.math.BigDecimal:3

               2 action: IN:VARCHAR2:java.lang.String:REQUEST JOB

               3 performed: IN:OUT:VARCHAR2:java.lang.String:N

               4 actionArg: IN:VARCHAR2:java.lang.String:-m HRMSARC_PAYROLL_ARCHIVE -u APPWORX -o AWPROD -q HRMS -f STORE -arg HRMSARC QPS QUICK _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_ _NULL_

               5 results: OUT:NUMERIC::null

               6 text: OUT:VARCHAR2::null

 

The actual error was java.sql.SQLException: ORA-20025: No role access to "Agent as "AWPROD". Per Greg I retried again by re-setting the BEFORE condition to submit HRMSARC back to “ONCE” and then re-starting the failed component. It completed okay the second time.

David.

 

 

 

Aborted Module Name:   AREGTTRN.RWCLIENT_01

  Date:        Day:      Time:          Resolution:

04/10/12      Tue       07:09          See follow up below.

08/20/12      Mon      16:26          Mark B. restarted the report server.

 

Error log and follow up comments:

 

04/10/12.

<<errtrap_ssh.6>> print *** \n*** ERROR: Sneffels SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Sneffels SCRIPT ABORTED - EXIT CODE=1

***

<<errtrap_ssh.7>> exit

+ grep SCRIPT ABORTED /ais02/log/AREGTTRN.RWCLIENT_01.7972115.7972129.01.2012_04_10_0709.log

+ 1> /dev/null

+ print rwclient execution unsuccessful

rwclient execution unsuccessful

 

We’ve contacted the oncall dba (Shawn) to request that the report server be recycled. 

Janice.

 

Does the report server need to be rebooted?

Vicki.

 

I’ve just restarted the report server. Please let me know if you encounter any further issues.

Shawn.

 

Just a FYI…I’m working on a way to detect this and automatically restart the report server.  Hopefully I’ll have that in very soon.

Mark B.

 

08/20/12.

<<errtrap_ssh.7>> exit

+ grep SCRIPT ABORTED /ais02/log/AREGTTRN.RWCLIENT_01.8821471.8821485.01.2012_08_20_1626.log

+ 1> /dev/null

+ print rwclient execution unsuccessful

rwclient execution unsuccessful

+ cat /ais02/log/AREGTTRN.RWCLIENT_01.8821471.SEND_MAIL_ERR.DAT

REP-0177: Error while running in remote server

Unable to connect to the specified database.

 

Mark B. restarted the report server and the job has completed successfully.

David.

 

 

Aborted Module Name:   ADMSBDMS_APPL_RWCLIENT

  Date:        Day:      Time:          Resolution:

04/10/12      Tue       12:06          See follow up below.

 

Error log and follow up comments:

 

I see that this and ADMSR207, 209, 263, 269 all failed.  Can you let me know if someone is working on getting these resolved and restarted and if there is anything we need to do on our end.

Marcella.

 

-----Original Message-----

From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]

Sent: Tuesday, April 10, 2012 12:06 AM

To: ADM Systems; IS DL: Alert ADMS

Cc: ADM Systems; IS DL: Alert ADMS

Subject: ADMSR206 FAILED

 

*** Oracle Report:  ADMSR206

    Processing Failed -- Report was not successfully generated.

*** From Appworx Chain: ADMSBDMS_APPL_RWCLIENT

*** Oracle Instance: banprod

*** Report Parameters Used:

    LAST_REPORT_DATE=20120406221634

    SENSITIVE_INFO=YES

 

*** Report Errors:

    REP-0177: Error while running in remote server

    OCI_INVALID_HANDLE. ==> --using a union so that the print_sensitive_info parameter will only select the W5 applications

 

 

David has replaced the file and reran ADMSBDMS. You should have your reports now.

Joleen.

 

 

 

 

 

Aborted Module Name:   OSYSJOBS_08.OSYSLLNK_01

  Date:        Day:      Time:          Resolution:

03/25/12      Sun       16:37           Restarted by Robin.

 

 

Error log and follow up comments:

 

 

Here's more about the error:

<#/ais02/job/prod/sys_llnk_ssh.ksh.27#> find / -name tmp -prune -o -name proc -prune -o -type l -ls

<#/ais02/job/prod/sys_llnk_ssh.ksh.27#> 1> /home/root/list_of_links

find: 0652-019 The status on /ais02/dat/work/prod/OSYSJOBS_02.OSYSPURG_01.7869420.7869423.00_orahomes is not valid.

<#/ais02/job/prod/sys_llnk_ssh.ksh.27#> errtrap_ssh /ais02/job/prod/sys_llnk_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

 

We sometimes see these "status not valid" errors with the find command. They tend to be a timing issue: a "temp" file existed when the find command "found" it and was deleted before find could read its status, which confuses the find command.

Please restart the failed OSYSJOBS_11.OSYSLLNK_01 component.

Same deal on the failed OSYSJOBS_08.OSYSLLNK_01 - please restart it too.

Janice.
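
To illustrate the timing issue, here is a minimal sketch of a link listing that tolerates files disappearing while the tree is being walked. It is not a replacement for sys_llnk_ssh.ksh; the function name is hypothetical, and the pruning of directories named tmp and proc mirrors the find command shown in the log.

```python
import os
import stat


def list_symlinks(root, pruned=("tmp", "proc")):
    """Return paths of symbolic links under root, skipping pruned directory names.

    Entries that vanish between being listed and being examined are skipped
    instead of being reported as errors.
    """
    links = []
    # os.walk ignores directories it cannot list when onerror is not set.
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into directories named tmp or proc (like find -prune).
        dirnames[:] = [d for d in dirnames if d not in pruned]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except FileNotFoundError:
                continue  # deleted after it was listed; ignore it
            if stat.S_ISLNK(mode):
                links.append(path)
    return links
```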

 

 

 

Aborted Module Name: AROSDBIO.AROSS141_01 

  Date:        Day:      Time:          Resolution:

04/05/13      Fri        18:05            Fixed data in TWRCUST and restarted.

 

Error log and follow up comments:

AROSDBIO.AROSS141 aborted in AWPROD for Friday’s schedule. I would prefer to get this one fixed before AppMan is shut down for the DB server changes this Sunday.

 

ERROR at line 1:

ORA-20100: AROSS141 Failure: -20100 ORA-20100: ::Calculated age cannot be

longer than 3 digits. Please check the birth date.::

ORA-06512: at line 104

Gudrun.

 

This is a data issue in Twarbus.  Someone entered the bday incorrectly.

Please restart the process. The birth date year was 0992, I am assuming that it should be 1992.

Don’t think anyone making twarbus transactions was born over 1000 years ago.

Someone should verify the birthday, the TWRCUST/Spriden ID is: 830163783

There is no reason to hold up the schedule for a typo or get more people involved on the weekend.

 

AROSS141 captured the error and printed the following line to the UTL file

Insert failed: 830163783 Lashley -20100 ORA-20100: ::Calculated age cannot be longer than 3 digits. Please check the birth date.::

 

NOTES:

AROSS141 is a very small program that uses the CSUT_BPI_PROCESS_TWRCUST package

-             You can execute the package’s cursor cur_newcust to pull the data that AROSS141 is processing and look for a bad birth date (Year was 0992 instead of 1992)

-             If the UTL file exists, then it pointed to the row in TWRCUST that had the bad birth year

-             CSUT_BPI_PROCESS_TWRCUST package actually calls the gb_identification, gb_bio, gb_address, gb_telephone, etc Banner APIs to create a new customer/person in Banner

 

Resolution:

Short Term

Modified AROSS141 to handle the data problem and put it in TEMP. Alternatively, we could have written an update statement to fix the bad data, called the on-call DBA to run it in BANPROD, and then had the program restarted.

Long Term

Modify TWARBUS to check the birthdate so that it will pass the same edits in GB_BIO that the “program” failed on (see GB_BIO_RULES) so that this data entry error will be caught by the form not when AROSS141 runs

  -- If no dead date, at least check that age < 1000 (3 digit limit)

  IF p_birth_date IS NOT NULL AND p_dead_date IS NULL AND

     trunc((sysdate - p_birth_date)/365) > 999 THEN

    p_build_error('BIRTH_DATE_OUT_OF_RANGE');

  END IF;

Josh.
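
The age edit quoted above can be tried out with the following sketch, which uses the same days/365 approximation. The function names are hypothetical, and the month and day in the usage example are made up, since only the birth year (0992 instead of 1992) is recorded here.

```python
from datetime import date


def age_in_years(birth_date, today):
    """Whole years between two dates, truncated, using days/365 as in the edit."""
    return int((today - birth_date).days / 365)


def birth_date_out_of_range(birth_date, dead_date, today):
    """True when there is no dead date and the calculated age needs more than 3 digits."""
    return (
        birth_date is not None
        and dead_date is None
        and age_in_years(birth_date, today) > 999
    )
```

For example, birth_date_out_of_range(date(992, 6, 1), None, date(2013, 4, 5)) is true, while the same call with date(1992, 6, 1) is false.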

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Aborted Module Name:   FAIDDLIM_EV.RERIM12_01

 

  Date:        Day:      Time:          Resolution:

04/25/12      Fri         06:26           Restarted by Joleen.

 

Error log and follow up comments:

 

 

ERROR at line 15:

ORA-12899: value too large for column "GENERAL"."GJBPRUN"."GJBPRUN_VALUE"

(actual: 32, maximum: 30)

 

After talking to the user, David removed crdl12op.2012_04_24_0006.bak.dat from the ais02 directory. I restarted FAIDDLIM_EV.RERIM12_01. FAIDDLIM_DIRECT_LOAN_IMPORT is now complete.

Joleen.
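
Note that the name of the removed file, crdl12op.2012_04_24_0006.bak.dat, is 32 characters long, which matches the "actual: 32, maximum: 30" in the ORA-12899 message. Assuming the import step stores each file name it finds in GJBPRUN_VALUE (an inference from the error, not confirmed here), a check like the sketch below would list the files to move out of the way before a restart. The function name is hypothetical.

```python
MAX_VALUE_LENGTH = 30  # maximum reported by ORA-12899 for GJBPRUN_VALUE


def names_too_long(names, limit=MAX_VALUE_LENGTH):
    """Return the file names that would not fit in a column of the given length."""
    return [name for name in names if len(name) > limit]
```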

 

 

 

Aborted Module Name:  FAIDSEND.TDCLIENT_SEND_01

  Date:        Day:      Time:          Resolution:

04/30/12      Mon       11:44           See follow up below.

 

Error log and follow up comments:

 

 

+ 1>> /ais01/dat/work/prod/FAIDSEND.TDCLIENT_SEND_01_jobstat

+ cat /ais01/dat/work/prod/FAIDSEND.TDCLIENT_SEND_01_jobstat

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cat: 0652-050 Cannot open /userfiles/Ufaid/data/crdl13in_2643924.xml.

 

 

FYI

There is a file with this name, crdl13in_2643924.xml, which exists in /oraapps/BANPROD/out/, which was created:   12/04/30  11:36

Did the user run something within gurjobs (via Banner application) to create this file… and now they want it sent?

Do they want us to copy this file to /userfiles/Ufaid/data so they can review it before transmittal?

Janice.

 

I worked with Karma to fix this.

David.

 

 

 

 

Aborted Module Name:   AREGDYGN.AREGS301_01

  Date:        Day:      Time:          Resolution:

05/02/12      Wed       20:05          See follow up below.

 

Error log and follow up comments:

 

ORA-06512: at line 121

20:05:08 119    BEGIN

20:05:08 120      --Update SORLFOS for NULL dept codes--

20:05:08 121      UPDATE sorlfos a

20:05:08 122         SET sorlfos_dept_code     = (SELECT swblfos_dept_code

20:05:08 123                                        FROM swblfos

20:05:08 124                                       WHERE swblfos_majr_code = a.sorlfos_majr_code

20:05:08 125                                         AND swblfos_activity_date =

20:05:08 126                                             (SELECT MAX(swblfos_activity_date)

20:05:08 127                                                FROM swblfos

20:05:08 128                                               WHERE swblfos_majr_code = a.sorlfos_majr_code)),

20:05:08 129             sorlfos_activity_date = SYSDATE,

 

Please run the query below and figure out what the problem is (too many rows) and see if Jerry or his folks can't resolve the issue.

Vicki.

 

I took a look at the problem and saw that in SWBLFOS there were two entries made yesterday for the IEAQ MAJR_CODE, one by Jerry and one by Sue. They may have different start and end terms, but AREGS301 doesn't take that into account, it goes by the MAX Activity Date - one has to go for now. And I'm guessing AREGS301 may have to change to take the term into account.

Peter.
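
The subquery in the UPDATE shown above must return exactly one department code per major code, so it fails when more than one SWBLFOS row for a major shares the latest activity date. The sketch below shows that check on rows already pulled from the table. The function name is hypothetical, and the sample rows in the usage note are made up apart from the IEAQ major code and department code 1784 mentioned in this thread.

```python
from collections import defaultdict


def ambiguous_majors(rows):
    """Find major codes where several rows share the latest activity date.

    rows: iterable of (majr_code, dept_code, activity_date) tuples.
    Returns {majr_code: [dept_code, ...]} for the ambiguous majors only.
    """
    by_major = defaultdict(list)
    for majr_code, dept_code, activity_date in rows:
        by_major[majr_code].append((activity_date, dept_code))

    result = {}
    for majr_code, entries in by_major.items():
        latest = max(activity for activity, _ in entries)
        depts = [dept for activity, dept in entries if activity == latest]
        if len(depts) > 1:
            result[majr_code] = depts
    return result
```

With rows such as ("IEAQ", "1784", "2012-05-01") and ("IEAQ", "9999", "2012-05-01"), the result is {"IEAQ": ["1784", "9999"]}; a major whose rows have different activity dates is not reported.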

 

I have deleted the offending record.

Jerry

 

I restarted AREGS301 and AREGDYGN_DAILY_GENERAL has completed.

Joleen.

 

If I re-enter the offending record today would AREGS301 run to completion?  The activity dates for the two records should be different...

Jerry.

 

A question I have is which Department Code would you like to get picked up when AREGS301 runs since it would get the record with the MAX Activity Date. The Department Code that is associated with the record that was entered yesterday, 1784, would never get picked up if you entered another record today.

Peter.

 

Ok - that would not be good...  I'll hang onto the transaction until AREGS301 is modified or fall 2012 starts.

Jerry.

 

 

Aborted Module Name:   KFSXCS31.KFSXS052_01

  Date:        Day:      Time:          Resolution:

05/14/12      Mon       07:23           Restarted by Dermot.

 

Error log and follow up comments:

 

 

ORA-01555: snapshot too old: rollback segment number 34 with name

"_SYSSMU34_400256035$" too small

ORA-02063: preceding line from KRPRD@KRUSER

ORA-06512: at line 168

 

07:23:31 160  Begin

07:23:31 161 

07:23:31 162  --Document the program has started.

07:23:31 163             DBMS_OUTPUT.PUT_LINE

07:23:31 164   ('**** Start of KFSXS052 ' ||to_char(sysdate,'MM/DD/YYYY HH24:MI:SS'));

07:23:31 165 

07:23:31 166  file_handle1 := UTL_FILE.FOPEN (utlpath, outfile1, 'W');

07:23:31 167 

07:23:31 168  For doc_rec in C1 Loop

07:23:31 169       IF v_written < v_max_count Then

07:23:31 170           outdata := doc_rec.fdoc_nbr;

07:23:31 171           UTL_FILE.PUT_LINE(file_handle1, outdata);

07:23:31 172           v_written           := v_written + 1;

07:23:31 173       End if;

 

Shawn increased the space on Rice, will you restart the job?

Josh.

 

 

 

 

Aborted Module Name:   CLMSDATA.SSH_EXEC_01

  Date:        Day:      Time:          Resolution:

05/30/12      Wed       05:44           See follow up below.

 

Error log and follow up comments:

 

# 2012.05.30-05:52:19 : >    Description: An error occurred with the following error message: "Mailbox unavailable. The server response was: 5.1.1 <heidi.kerr@colostate.edu>... User unknown".

# 2012.05.30-05:52:19 : > End Error

Heidi Kerr is no longer with CSU.

Please replace her email with Celeste Ulland (Celeste.Ulland@Colostate.EDU).

Phil.

 

This email is actually stored in the procedure (SQL package) we are executing remotely.  However, it is not the only warning/error in the script.  Ultimately, the script failed with:

 # 2012.05.30-05:52:19 : > Warning: 2012-05-30 05:52:19.50^M

# 2012.05.30-05:52:19 : >    Code: 0x80019002^M

# 2012.05.30-05:52:19 : >    Source: BannerUpdateDriver ^M

# 2012.05.30-05:52:19 : >    Description: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED.  The Execution method succeeded, but the number of errors raised (1) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.^M

# 2012.05.30-05:52:19 : > End Warning^M

# 2012.05.30-05:52:19 : > DTExec: The package execution returned DTSER_FAILURE (1).^M

Elden.

 

FYI – just to add to Elden’s conclusion

Here is the command that is being remotely executed once ssh connection has been established. We can’t make any changes to those files.

cmd.exe /V:ON /C

"D:\Program Files\SCT\clm\DtsxPackages\dtexec_wrapper.cmd"

"file=D:\Program Files\SCT\clm\DtsxPackages\BannerUpdateDriver.dtsx"

"config=C:\Users\sshuser\.ssh2\BannerUpdateDriver.cfg"

Gudrun.

 

I’ve updated the email list in the DTSX package and copied to the CLM server.

Can you add me to the IS DL: Alert CLMS HOUS group?

~Steven Dove.

 

We do not add non-IS staff to these lists.  Instead, we request that the user department set up their own list and then we add that list to the appropriate address file or other appropriate object for the process flows.  We do not want to be responsible for maintaining user address lists.  Janice has gone through a lot of work to support this methodology.

Elden.

 

There are two potential lists that you could be added to:

BFS_CLM_FTP (used for CLMSSEND_NSLDS_DATA and CLMSSGET_NSLDS_ERROR_FILE)

BFS_CLM_PRODUCTION (used for CLMSAM99_CLMS_SCHEDULE_DONE)

I believe that the BFS personnel are responsible for updating these lists.

You should also remove Jan Mueller (retired) and Phil Chambers from the BFS_CLM_FTP list.

We also noticed that Heidi Kerr is on this list.

David.

 

Chris Glaze has updated the lists.  Thanks for pointing me in the right direction.

~Steven Dove.

 

 

 

Aborted Module Name:   KFSXCS31.KFSXS052_01

  Date:        Day:      Time:          Resolution:

09/29/12      Fri         22:11           Restarted by Dermot.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-01555: snapshot too old: rollback segment number 15 with name

"_SYSSMU15_2222981817$" too small

ORA-02063: preceding line from KRPRD@KRUSER

ORA-06512: at line 168

 

Please restart the module.
Thanks
Josh.

 

 

 

Aborted Module Name:   FAIDALIM.SSH_SFTP_RN_02

 

  Date:        Day:      Time:          Resolution:

10/25/12      Thu       07:02           Restarted by Joleen.

 

Error log and follow up comments:

 

 

# > Permission denied (password,gssapi-with-mic).

***

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

 

I just tested the connection and it worked.

Since this job originally failed during the connection, unless the process flow and job notes suggest otherwise, you may reset the job.

Elden.

 

There were no notes and no conditions. I reset the job. FAIDALIM_ALTERNATE_LOAN_IMPORT has finished running.

Joleen.

 

 

 

 

Aborted Module Name:  KFSXPDCA.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

07/17/12      Tue       19:10           Requested in again by Dermot.

 

Error log and follow up comments:

 

 

 

2012-07-17 19:10:42,431 [RMI TCP Connection(9)-129.82.127.238] FATAL org.kuali.rice.core.database.KualiTransactionInterceptor :: Exception caught by Transaction Interceptor, this will cause a rollback at the end of the transaction.java.lang.NullPointerException

 

2012-07-17 19:10:42,267 [RMI TCP Connection(9)-129.82.127.238] INFO  org.kuali.rice.kns.service.impl.DocumentServiceImpl :: storing document 1919369

 

2012-07-17 19:10:42,672 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: java.lang.NullPointerException

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.44#> [[ 1 > 0 ]]

<#errtrap_ssh.44#> exit 1

<</ais02/job/prod/kshexe_ssh.90>> errtrap_ssh kshexe_ssh 1

Remote Shell errtrap_ssh parm 2 value is 1

<<errtrap_ssh.3>> [[ 1 > 0 ]]

<<errtrap_ssh.6>> print *** \n*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Malta SCRIPT ABORTED - EXIT CODE=1

 

Per John, it appeared that there was a glitch with RICE while this JAVA step was running: processPdpCancelsAndPaidStep

Process Flow / Chain was restarted the next morning and completed successfully.

If this ABORT happens again, check that format checks "KFSXPDDR" is not running before requesting the KFSXPDCA Process Flow.

Dermot.

 

 

 

Aborted Module Name: APMXMISC.APMXPURG_01  

  Date:        Day:      Time:          Resolution:

06/27/12      Mon       13:35           Restarted by Joleen.

 

 

Error log and follow up comments:

 

 

rm: 0653-609 Cannot remove ./obs/COMPLETION.ACRD.20081031_092518.

The file access permissions do not allow the specified action.

rm: 0653-609 Cannot remove ./obs/FTP_EPRINT.SH.20030818_104042.OBS.

The file access permissions do not allow the specified action.

rm: 0653-609 Cannot remove ./obs/FTP_EPRINT.SH.20050505_131713.OBS.

The file access permissions do not allow the specified action.

 

What is the status of this jobs directory? Do we want it to be part of the regular APMXPURG removal of files in /appworx/csu/exec?

Gudrun.

 

Just a permissions issue. The apmxpurg script cleaned up timestamps for the first time today. Greg fixed the permissions issue for the obs directory.

David.

 

 

 

 

Aborted Module Name:   APMXLOOK_AM.SEND_MAIL_04

  Date:        Day:      Time:          Resolution:

08/20/12      Mon       08:00          Deleted by David.

08/04/14      Mon       08:00          Deleted by David.

 

Error log and follow up comments:

08/20/12.

#   --> --options=" ERROR -999 ORA-01722: invalid number"

#   --> --options=""

# > (3)

#==============================================================================

# FATAL : Command failed with code : 3

#------------------------------------------------------------------------------

# RETURN CODE = 100 (/appworx/csu/exec/build_parms_with_multiselect.pl)

#==============================================================================

# Passing Parms : arg=[ from="jobprd@mailer.is.colostate.edu" reply_to="/ais01/dat/misc/mailst/SEND_MAIL.IS_SUPPORT_SCHEDULING.LST" to="Jobs" cc="-" bcc="Production" subject="Jobs" --options=" ERROR -999 ORA-01722: invalid number" --options=""] /usr/bin/perl /appworx/csu/exec/SENDMAIL.PL  from="jobprd@mailer.is.colostate.edu" reply_to="/ais01/dat/misc/mailst/SEND_MAIL.IS_SUPPORT_SCHEDULING.LST" to="Jobs" cc="-" bcc="Production" subject="Jobs" --options=" ERROR -999 ORA-01722: invalid number" --options=""

#==============================================================================

# [ 2012.08.20-08:00:30 ]

#******************************************************************************

# FATAL : < main::parse

# FATAL : Unknown option ( ERROR -999 ORA-01722: invalid number)

 

I deleted this. This will hopefully be fixed next time it runs. I added account jobprd to the /ais01/dat/apmx/prod/APMXLOOK_EMAIL_OVERRIDE.DAT file to send the Email to APMX developers if the owner of the temp file is jobprd.

David.
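
In the 08/20/12 abort, the text of an Oracle error (ORA-01722) ended up in the --options value passed to SENDMAIL.PL, which then rejected it as an unknown option. A check along the lines of the sketch below would catch that kind of value before the command is built. This is not part of build_parms_with_multiselect.pl; the function name and the way the check is wired in are assumptions.

```python
import re

# Matches Oracle error codes such as ORA-01722.
ORA_ERROR = re.compile(r"\bORA-\d{5}\b")


def check_derived_value(name, value):
    """Return value unchanged, or raise ValueError if it looks like an Oracle error."""
    if ORA_ERROR.search(value):
        raise ValueError(
            f"{name} was derived as an error message: {value.strip()!r}"
        )
    return value
```

Failing at this point would report the lookup problem directly, instead of surfacing later as "Unknown option" from the mail script.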

 

08/04/14.     

Mon Aug 04 08:05:30 MDT 2014                                                                                                        Page 1

                                              Check Backlog for ABORTED jobs (so_status  202)                                             

Job                      Chain Id Start Date              Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

------------------------ -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

APMXLOOK_AM.SEND_MAIL_04          08-04-2014 08:00:46 MDT    202 ABORTED                                        279

 

I fixed APMXLOOK_AM.SEND_MAIL_04.

I added Steve Greene to the /ais01/dat/apmx/prod/APMXLOOK_EMAIL_OVERRIDE.DAT file so his Email would be derived correctly in the future.

I had to manually fix the prompt in the APMXLOOK_AM.SEND_MAIL_04 for this one and re-start.

David.

    

 

 

Aborted Module Name:  APMXLOOK_AM.SEND_MAIL_01

  Date:        Day:      Time:          Resolution:

03/18/13      Mon       08:00          see note from Steve below.

 

Error log and follow up comments:

 

# - Sending Message

#   MIME::Lite version  : 3.027

#   MAIL COMMAND        : smtp.colostate.edu , Debug => '0', Timeout => '60'

#   BUILDING HEADERS

#   BUILDING BODY

SMTP recipient() command failed:

5.2.2 mail delivery suspended,mailbox full

 

error is 255

===== Exiting PERL_CSU =====

+ err=255

I removed the CC Email address and it finished successfully.

 

Elden, it looks like one of these mailboxes is full:

eflick@lamar.colostate.edu

eflick@mail.colostate.edu

elden.flick@colostate.edu

Steve.

 

The lamar mailbox was showing mostly empty, but evidently was only marking files as deleted instead of deleting them!  It should be good for now.

Elden.

 

 

Aborted Module Name:   ODSRKFSX.ODSRS002_01

  Date:        Day:      Time:          Resolution:

09/25/12     Tue        00:53           See follow up below.

 

Error log and follow up comments:

 

 

Can you send me the location of the log file from whatever job it was that you called me about?  I see no email yet with the job number or anything.

Mark B.

 

Just sent

Gudrun.

 

Can you restart this job easily?  I can’t really see anything wrong at this point.

Mark B.

 

If it can’t be resolved at this point, only the ODS schedule will be delayed, given the dependencies on that component.

KFSX schedule actually completed already.

If you want to reset, call 970 581 5577. I assume this one will wait until regular work hours to be resolved.

Gudrun .

 

I will leave an email for Mark with my findings…somewhere we have a password issue but I can’t find the problem.  Hopefully he’ll know what it is right away.  After that I’m logging out and going to bed.

Mark B.

 

Mark B needs to talk to Mark P during the day.  ODS schedule will be delayed to complete. Other schedules moving except for EID schedule which has another abort due to be resolved during the day.

However post AGEN notify file creation. :)

Gudrun .

 

One of you can re-start the failed chains.  I have corrected the connection problems on ODS Prod.

I caused this yesterday afternoon: while finishing the post-clone steps on ODSDevl, I accidentally changed a couple of passwords on ODSProd.

Mark P.

 

 

 

Aborted Module Name:   EIDSUPDT.HRMSS111_01

  Date:        Day:      Time:          Resolution:

09/24/12     Mon        22:24           Restarted by Robin.

 

Error log and follow up comments:

 

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

829886569 ORA-01422: exact fetch returns more than requested number of rows

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

 

Bob and I took a quick look at this abort and did not see any reason for the problem. 

Robin - can you please restart HRMSS111 and see if it doesn't work fine this time.

Vicki.
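
The "SEARCH OF STDOUT FOR SQL ERROR STRINGS" step above can be sketched as a simple scan of the job output. This is only an illustration: the actual search strings used by the production wrapper are not shown in this log, so the ORA/SP2/PLS prefixes and the function name below are assumptions.

```python
import re

# Assumed error prefixes; the production wrapper's real list is not shown in the log.
SQL_ERROR = re.compile(r"\b(ORA|SP2|PLS)-(\d{4,5})\b:?\s*(.*)")


def find_sql_errors(lines):
    """Return (code, message) pairs for every line that contains a SQL error string."""
    hits = []
    for line in lines:
        match = SQL_ERROR.search(line)
        if match:
            code = f"{match.group(1)}-{match.group(2)}"
            hits.append((code, match.group(3).strip()))
    return hits
```

Run against the excerpt above, the only hit would be the ORA-01422 line; "Processed 100 rows" style lines are ignored.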

 

 

 

 

Aborted Module Name:  ADMSLETA_ADMIT_DENY_LETTERS

  Date:        Day:      Time:          Resolution:

09/28/12     Fri         22:01           See follow up below.

 

Error log and follow up comments:

 

We had a problem with our admit/deny program so no Admit letters were generated on Friday night (for Spring, Summer & Fall semester). We are looking into this and will give you an update later today.

Could you tell us what web page this job is calling? It would be helpful in troubleshooting the problem.

Erica Burr.

 

URL is https://wsnet.colostate.edu/ai/appworx/RunLetters.aspx

David.

 

We gave you the wrong URL; it should be https://wsnet.colostate.edu/ai/Tools/Letters/RunLetters.aspx.

Can you please update the Appworx job for - ADMSLETA_ADMIT_DENY_LETTERS to call https://wsnet.colostate.edu/ai/Tools/Letters/RunLetters.aspx.

 

No need to rerun the job. I manually ran it and it's fine.

 

https://wsnet.colostate.edu/ai/appworx/RunLetters.aspx - ADMSLETA_ADMIT_DENY_LETTERS

Dependency: No

Run Time: 8:00 PM

Occurs: Daily Monday - Sunday

Description: admit/deny letters.

Erica Burr.

 

I have replaced the URL with:

https://wsnet.colostate.edu/ai/Tools/Letters/RunLetters.aspx

 

 

Joleen.

 

 

 

 

 

Aborted Module Name:  HRMSS006.FTPS_CURL_01

  Date:        Day:      Time:          Resolution:

10/06/12     Sat         14:19           Restarted by Steve.

 

 

Error log and follow up comments:

 

 

# > * SSLv3, TLS handshake, Finished (20):

# > } [data not shown]

# > * SSLv3, TLS change cipher, Client hello (1):

# > { [data not shown]

# > * SSLv3, TLS handshake, Finished (20):

# > { [data not shown]

# > * SSL connection using DHE-RSA-AES128-SHA

# > * Server certificate:

# > *  subject: /C=US/ST=CO/O=SDNH/L=Denver/OU=300/emailAddress=netadmin@policy-studies.com/CN=www.sdnh.state.co.us

# > *  start date: 2007-05-02 19:08:08 GMT

# > *  expire date: 2017-05-03 02:08:07 GMT

# > *  common name: www.sdnh.state.co.us (does not match 'sdnh.state.co.us')

# > *  issuer: /C=US/ST=CO/O=SDNH/L=Denver/OU=300/emailAddress=netadmin@policy-studies.com/CN=www.sdnh.state.co.us

# > * SSL certificate verify result: self signed certificate (18), continuing anyway.

# > > USER CSU

# > * FTP response reading failed

# > * Closing connection #0

# > * SSLv3, TLS alert, Client hello (1):

# > } [data not shown]

# >

# > curl: (56) FTP response reading failed

# > (56)

#==============================================================================

# FATAL : Command failed with code : 56

 

 

I reset this and it finally finished...

Steve G.

 

As per Steve this ABORT with the above error message is ok to restart.

Dermot.
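
curl exit code 56 means a failure while receiving network data, which is consistent with the note above that this abort is ok to restart. The manual restart could be sketched as a bounded retry, as below. This is an assumption-laden illustration, not how FTPS_CURL is implemented: `run` is a hypothetical callable that performs one transfer attempt and returns its exit code.

```python
import time


def run_with_retries(run, retryable_codes=(56,), attempts=3, delay_seconds=0.0, sleep=time.sleep):
    """Call run() until it returns 0, a non-retryable code, or attempts are used up.

    Returns (last_exit_code, attempts_used).
    """
    code = None
    for attempt in range(1, attempts + 1):
        code = run()
        if code == 0 or code not in retryable_codes:
            return code, attempt
        if attempt < attempts:
            sleep(delay_seconds)
    return code, attempts
```

Any other exit code (for example an authentication failure) is returned immediately, so it still aborts and gets looked at by a person.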

 

 

 

Aborted Module Name:   HRMSCPR_SAL_HRMSS063_01

  Date:        Day:      Time:          Resolution:

10/09/12     Tue         08:22           Restarted by Robin.

 

 

Error log and follow up comments:

 

+---------------------------------------------------------------------------+

Start of log messages from FND_FILE

+---------------------------------------------------------------------------+

+---------------------------------------------------------------------------+

End of log messages from FND_FILE

+---------------------------------------------------------------------------+

**** Start of HRMSS063 10/09/2012 08:22:43

Amount Not Distributed: Hadrich,Joleen         Moving Reimbursement      1371910

837.08

declare

*

ERROR at line 1:

ORA-20000: **** FATAL ERROR! Some Money Could Not Be Distributed! ****

ORA-06512: at line 1063

 

Robin,

Go ahead and retry it.  I added the account to the GL_CODE_COMBINATIONS table so we should be good to go.

Steve H.

 

 

 

Aborted Module Name:   KFSXAM11.WAIT_ENCUMB_DEL_01

  Date:        Day:      Time:          Resolution:

10/11/12     Thu         01:58           Restarted by Joleen.

 

 

Error log and follow up comments:

 

 

10/11/2012 01:58    JWEARNE

I received a Page just after 1am. The message said Application manager Agent not running. I saw that KFSXAM11.WAIT_ENCUMB_DEL_01 was aborted. There was a message in Comments that said:

Agent error : timeout  SeqNo 170863 Agent AWPROD Master AWPROD service AWPROD Method openFile [$SQLOPER_HOME/out/KFSXAM11.WAIT_ENCUMB_DEL_01.9122369.00.txt, false, false] : null

2012-10-11 01:55:46

There was no output file. The only KFSX job waiting to run was KFSXAM99. The only HRMS jobs waiting to run are HRMSAM99 and HRMSS033 which is waiting for 03:00. I saw in the comments query that this job had aborted on 9/13/2012.

I checked the history and saw that Gudrun had restarted the job that night, so I restarted this one and it finished.

 

 

 

Aborted Module Name:  ADMSSRLD_DY.SQLSURLOAD-LOOP_01

  Date:        Day:      Time:          Resolution:

10/12/12     Fri         22:23           Restarted by Joleen.

 

Error log and follow up comments:

 

Component ADMSSRLD_DY.SQLSURLOAD-LOOP_01 aborted in AWPROD with the error message below for letter AEML_MU3AS3 and the sql executed at the time. It appears the other letters completed, except for AEML_MU3S3.

Suggested Resolution Path:  First answer question then act.

 

Questions: Is the sql ok? If the sql is not ok, what is the correct sql? We need to rerun letters just for that sql. Provide a temp .dat file and reset.

 

If the sql is ok, should it have generated letters?

 

    If not, delete the component and let the other things finish. Identify the reason why the sql failed in AppMan.

 

    If letters should have been generated and the sql is ok, we need to rerun for those letters only – provide a temp .DAT file with only that sql, I assume. Also identify the reason why the sql failed in AppMan.

 

SP2-0734: unknown command beginning "and c.sarc..." - rest of line ignored.

SP2-0044: For a list of known commands enter HELP

SP2-0734: unknown command beginning "and c.sarc..." - rest of line ignored.

SP2-0734: unknown command beginning "and e.gore..." - rest of line ignored.

SP2-0734: unknown command beginning "and e.gore..." - rest of line ignored.

SP2-0734: unknown command beginning "and e.gore..." - rest of line ignored.

SP2-0044: For a list of known commands enter HELP

Gudrun.
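
The log does not establish the cause of the SP2-0734 messages. One well-known way to get them on lines beginning with "and" is a blank line inside a statement: with the SQL*Plus default of SET SQLBLANKLINES OFF, a blank line ends statement entry and the following lines are parsed as commands. A minimal sketch for checking a letter's sql for that pattern follows; the function name is illustrative and the check ignores comments and PL/SQL blocks.

```python
def blank_lines_inside_statements(sql_text):
    """Return 1-based line numbers of blank lines that fall inside an unterminated statement."""
    flagged = []
    in_statement = False
    for number, line in enumerate(sql_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            if in_statement:
                flagged.append(number)
            continue
        # A trailing ';' or a lone '/' ends the current statement.
        in_statement = not (stripped == "/" or stripped.endswith(";"))
    return flagged
```

If this returns any line numbers for the sql in the .dat file, removing those blank lines (or setting SQLBLANKLINES ON) would be worth trying before the rerun.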

 

I think Bev will need to take a look at ADMSSRLD on Monday when she gets back. I'm pretty sure she will need to talk to the user about this one.

 

When EIDSUPDT aborts Vicki manually runs the sql section in question and many times she doesn't find an error. She has me restart and it usually finishes without aborting again. We can wait until Monday to have someone look at this if you like. I'm wondering if we should just restart since it doesn't seem to hurt anything. Also, Vicki had a family emergency and had to leave town. I'm not sure if any of my team members know EIDS as well as she does. What do you think?

Joleen.

 

Thanks for the info. Yes, let's restart EIDSUPDT and see what happens. It was already restarted once anyway. No conditions attached. About the letters, I added some more information. Maybe Rob or Rami can help. At this point I think the sql is correct: it does not return anything, and the component can be deleted provided the other letters got generated, but I am waiting for confirmation on that.

Gudrun.

 

 

 

 

Aborted Module Name: AREGDYMP.VPLUS_RCAP-LOOP_01  

  Date:        Day:      Time:          Resolution:

10/14/12     Sun         22:15           Restarted by Joleen.

 

Error log and follow up comments:

 

+ spawned_module_name=VPLUS_RCAP

+ . SRC_APMX_STATUS_FOR_SPAWNED.KSH

+ set -x

+ awexe jh

+ grep 9144166

+ egrep ABORTED|CRITFAIL|C-Error

      9144166.00 BATCH     AREGDYMP.VPLUS_RCAPT10/14 22:16 00:00:02 ABORTED                AREGDYMP_MATH_PLACEMENT_LOAD

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

+ err=1

+ [ 1 -eq 0 ]

+ [ 1 != 0 ]

+ status=ABORTD

 

This looper actually spawns another job to do the captures.  The log for the failed spawned job is:

  AREGDYMP.VPLUS_RCAPTURE_01.9144158.9144166.00.2012_10_14_2215.AWPROD.LOG

The error in this log is:

  Login successful!   Host: vplusprod.is.colostate.edu   Port: 7980

  Continuing with data transfer.

  ERROR: Reading Packet from server: error=-3

  ERROR: Capture processing terminated.

  ERROR: Network read error: 73; Connection reset by peer

  ERROR: Reading Packet from server: error=-2

  + exit 17

  error is 17

This is the same error Rich and I have been working on.

 

I checked if the file that RCAPTURE was trying to capture to Vista Plus was in Vista Plus -- it wasn't.  I also checked the driver for the RCAP-LOOP and found it just had the one entry, so I reset the job. 

I confirmed the report is in Vista Plus.

Elden.
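
The status check in the trace above (awexe jh piped through grep for the job id and egrep for ABORTED|CRITFAIL|C-Error) can be sketched as follows. This is an illustration only; it assumes the job history is available as a list of text lines and is not the code in SRC_APMX_STATUS_FOR_SPAWNED.KSH.

```python
import re

# Same status words the egrep in the trace looks for.
FAILURE_STATUSES = re.compile(r"ABORTED|CRITFAIL|C-Error")


def spawned_job_failed(history_lines, job_id):
    """True if any history line for job_id shows one of the failure statuses."""
    return any(
        job_id in line and FAILURE_STATUSES.search(line) is not None
        for line in history_lines
    )
```

When this is true the looper aborts itself, which is why the abort shows up on the -LOOP component rather than on the spawned RCAPTURE job.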

 

 

 

 

Aborted Module Name:  EIDSUPDT.EIDSS002_01

  Date:        Day:      Time:          Resolution:

10/14/12     Sun         22:15           Restarted by Joleen.

01/04/13     Fri          22:28           Restarted by Joleen.

 

Error log and follow up comments:

 

10/14/12.

I received a DBA cell call at 2:20 am. Nobody stated anything. However, I logged on and saw two job components aborted.

These have to be followed up during the daytime. EIDSUPDT will delay ODS. I made a Tracker web entry.

 1.       EIDSUPDT.EIDSS002_01

Processed 100 rows

Processed 100 rows

Processed 100 rows

Processed 100 rows

declare

*

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error

ORA-06512: at line 27

Gudrun.

 

Recently when EIDSUPDT.EIDSS002_01 has aborted, I have been asked to just restart it. After checking the Abort log and consulting with Gudrun, I restarted EIDSUPDT.EIDSS002_01 and it finished running just before 10am.

Joleen.

 

01/04/13.

ERROR at line 1:

ORA-06502: PL/SQL: numeric or value error

ORA-06512: at line 23

 

This job aborted on Thursday with the same error. Peter worked for hours trying to figure out what the problem was. In the end we just restarted the job and it finished. I am going to restart this job and hope it finishes. It is holding up the EIDS PROD and HRMS refresh.

Joleen.

 

 

 

 

 

Aborted Module Name:   FAIDSAIG_OD.TDCLIENT_01

  Date:        Day:      Time:          Resolution:

10/17/12     Wed        00:15          See note from Gudrun & David below.

10/18/12     Thu         01:28          See note from Gudrun & Elden below.

 

Error log and follow up comments:

 

Acknowledge AppMAN FAIDSAIG_OD.TDCLIENT_01 abort in AWPROD. Unlike last night, TDCLIENT_01 processing completed partially. It failed during NOTISIR processing. I am contacting David and/or Elden. Need confirmation to ONLY rerun the NOTISR section of that run.

# 20121017-000941 : pipe_exec                 | cmdout = <debug1: Exit status 139

# 20121017-000941 : *** FATAL ***main::check_status | SEND_TO_CMD (close) [0] failed to execute (139)

# 20121017-000941 : *** FATAL ***main::check_status | (100)

# 20121017-000941 : *** FATAL ***main::check_status |

# 20121017-000941 : *** FATAL ***main::check_status | CMDOUT (close) [1065140] failed to execute (100)

# 20121017-000941 : *** FATAL ***main::check_status | (100)

# 20121017-000941 : *** FATAL ***main::check_status |

This is a non-critical abort; however, because it delays the ODSR schedule, which is still critical, I made an entry in Tracker Web.

Gudrun.

 

Gudrun called to report FAIDSAIG_OD.TDCLIENT_01 had failed.

I verified her findings that it failed during NOTISIR processing. I advised to contact Elden to get help in the appropriate recovery.

David.

 

10/18/12.

Segmentation fault error. Component is non-critical; however, I informed Elden.

Gudrun.

 

FAIDSAIG_OD.TDCLIENT - core dump the second night in a row.  As with the failure the night before, the ISIR files were downloaded and accumulated successfully.  However, this time we received one non-ISIR file CRDL13OP and failed on the CRPG13OP download. I copied the CRDL13OP file from the work directory to where it would have been copied on success (/userfiles/Ufaid/data). Since it was already downloaded, TDCLIENT would not find it when I reset the process flow. However, the logic will think it was left over from a previous run and process it as expected. Next, I changed the "Run ISR files" prompt in backlog from "BOTH" to "NOTISR" to pick up just the non-ISIR files.

*  I put the next component in the process flow on hold

*  I reset the TDCLIENT

*  When it finished successfully and the log looked good, I released the hold on the next component

It looks like we didn't receive the CRPG13OP successfully in the first run and it wasn't found to download in the second run, so Financial Aid may need to research and request the file again if needed.

 

The default is 'BOTH' which runs ISIR-s, then non-ISIR-s.  Since it failed after ISIR-s, I changed 'BOTH' to 'NOTISR' to just pick up the non-ISIR-s.  However, since we are running TDCLIENT from Quartz (as of a few months ago), we also have to manually copy any successfully downloaded files from the working _proxy_ directory back to the original directory (/userfiles/Ufaid/data/ in this case) before we reset with the 'NOTISR' option.

 

Here is the modlog for the change to the script:

 

#* 09/09/2011--GK--T07365--Pass in prompts for ftpfrom/ftpto_user      *

#*                         Allow for easy resetting of runtype RECEIVE *

#*                         by specifying which files to process to var *

#*                         run_isr_files:ISR, NOTISR or BOTH.Dft:BOTH  *

For now, please contact me for aborted TDCLIENT jobs

Elden.
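
The recovery described above has two mechanical parts: choosing which files to process from the run_isr_files value (ISR, NOTISR or BOTH), and copying already-downloaded files from the working directory to the data directory before the reset. A minimal sketch of both follows. It is not the production script: the `is_isir` predicate is hypothetical (the log does not say how ISIR message classes are recognized), and the directories are passed in rather than hard-coded.

```python
import shutil
from pathlib import Path


def classes_to_process(run_isr_files, message_classes, is_isir):
    """Select message classes for a run; BOTH processes ISIRs first, then non-ISIRs."""
    if run_isr_files not in ("ISR", "NOTISR", "BOTH"):
        raise ValueError(f"unknown run_isr_files value: {run_isr_files!r}")
    isir = [c for c in message_classes if is_isir(c)]
    other = [c for c in message_classes if not is_isir(c)]
    if run_isr_files == "ISR":
        return isir
    if run_isr_files == "NOTISR":
        return other
    return isir + other


def copy_downloaded(work_dir, data_dir, names):
    """Copy files that exist in work_dir and are not yet in data_dir; return names copied."""
    copied = []
    for name in names:
        source = Path(work_dir) / name
        target = Path(data_dir) / name
        if source.is_file() and not target.exists():
            shutil.copy2(source, target)
            copied.append(name)
    return copied
```

In the 10/18 case this would copy CRDL13OP (downloaded) and skip CRPG13OP (never received), after which the reset with NOTISR picks up only the non-ISIR files.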

 

 

 

Aborted Module Name:   FAIDTRAK_OD.GLBDATA_21

  Date:        Day:      Time:          Resolution:

10/17/12     Wed        06:57          Restarted by Steve.

 

Error log and follow up comments:

 

 

ORA-03135: connection lost contact

Process ID: 0

Session ID: 0 Serial number: 0

 

SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon>  ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0306: Invalid option.

Usage: CONN[ECT] [logon] [AS {SYSDBA|SYSOPER|SYSASM}] where <logon>  ::= <username>[/<password>][@<connect_identifier>] [edition=value] | /

SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

Now turn on set -x for debug purposes

+ [ -f login.11174695 ]

+ echo Could not log in to SQL*Plus.

Could not log in to SQL*Plus.

+ echo Exiting with error (return code = 5).

Exiting with error (return code = 5).

+ exit 5

+ err=5

 

The last GLBDATA_21 did not start processing. The current iteration count for the GLBDATA loop is 20.

Resetting the GLBDATA-LOOP component will restart processing from where we left off.

Gudrun.

 

 

 

Aborted Module Name:   FAIDALEX_OD.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

10/18/12     Thu         18:02          Restarted by Elden.

04/10/13     Wed        18:01          See note from Gudrun.

 

Error log and follow up comments:

 

10/18/12.

FAIDALEX_OD.SSH_SFTP_01 -- while I was working on the FAIDSAIG aborted job, I also noticed that this FAIDALEX process flow also aborted.  I researched this and believe the file was transferred ok; it looks like ELM processed it so fast that we couldn't find it when we tried to list it.

I found the file in the transferred folder and it matches the file we sent.

*  Since the file was already uploaded, I decided to confirm connectivity and finish the job successfully, so I changed the prompts to download the copy from the remote transferred directory.

*  I reset the job and it finished successfully.

 

FOLLOW UP:

*  While testing, sometimes the SSH identity file was failing and instead was trying to use password authentication.  We need to research this a bit.

*  We probably want to add the "nofatal_list_after_put" option to the job

Elden.
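
The 10/18/12 abort came from listing the file after the put, when the remote side had already moved it to the transferred folder. A sketch of how a non-fatal list-after-put decision could look is below. The option name comes from the note above, but its real behavior in the SSH_SFTP script is not shown in this log, so the logic and return values here are assumptions.

```python
def put_outcome(listed_after_put, found_in_transferred, nofatal_list_after_put=False):
    """Decide how to treat the post-put listing result (illustrative logic only)."""
    if listed_after_put:
        return "ok"
    if found_in_transferred:
        # The remote side picked the file up before we could list it.
        return "ok-already-processed"
    return "warning" if nofatal_list_after_put else "fatal"
```

With the option off, a missing listing still aborts the job unless the file can be confirmed in the transferred folder, which is the check that was done by hand above.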

 

 

04/10/13.

Wed Apr 10 18:05:45 MDT 2013                                                                                                       Page 1

                                             Check Backlog for ABORTED jobs (so_status  202)                                            

Job                     Chain Id Start Date              Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

----------------------- -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

FAIDALEX_OD.SSH_SFTP_01 10256259 04-10-2013 18:01:24 MDT    202 ABORTED             2102.24                    248                     12

 

FYI - just looked at the output - a thought

We may need to check, for this abort, whether they received the file. It appears sftp was able to connect but the host failed to communicate back?

Gudrun.

 

 

 

Aborted Module Name:   FAIDSAIG_OD.TDCLIENT_01

  Date:        Day:      Time:          Resolution:

10/19/12     Fri          00:09          See followup from Elden below.

 

Error log and follow up comments:

 

10/19/2012 01:41    EFLICK

FAIDSAIG_OD.TDCLIENT_01 - Researched - copied file - reset -  failed again - researching more - will give additional details in later note

 

10/19/2012 02:54    EFLICK

FAIDSAIG_OD.TDCLIENT_01 - successful

1st failure / core dump:

Indications are that we were processing 3 files for message class CRDL13OP - one new and 2 resends from previous night aborts:

* looks like batch 2012-10-18T07:46:15.422369043 downloaded to CRDL13OP ok

* looks like batch 2012-10-16T08:55:19.502369043 failed causing core dump

* looks like batch 2012-10-16T08:47:25.942369043 also not downloaded

+ I copied the CRDL13OP from working _proxy_ to /userfiles/Ufaid/data

+ changed the "Run ISR files" prompt to "NOTISR"

+ reset job

 

2nd abort with another core dump:

* looks like CRDL13OP.002 batch 2012-10-16T08:47:25.942369043 downloaded ok (file 20121016A00259467416)

* looks like aborted in CRECMYOP for batch 2012-10-18T04:03:58.3000000001 (file 20121018A00259738479)

+ I copied the new CRDL13OP from the _proxy_ temp dir to /userfiles/Ufaid/data

+ confirmed the prompt was "NOTISR"

+ reset job

 

FOLLOW UP:

* Fin Aid probably didn't get one of the CRDL13OP batches that failed before (batch 2012-10-16T08:55:19.502369043)

* Fin Aid didn't get the CRECMYOP (batch 2012-10-18T04:03:58.3000000001)

* Evidence is mounting that we have data file corruption triggering the core dumps.  We need to check with SAIG:

  * Is there a known issue and patch available?

  * Why did it start this week?

  * I have logs and core dumps which we can send to SAIG support if needed

 

We did reboot Quartz this afternoon, so it should be very 'clean' resource-wise.

 

 

 

Aborted Module Name:  AREGDYMP.VPLUS_RCAP-LOOP_01

  Date:        Day:      Time:          Resolution:

10/28/12     Sun        22:16           Restarted by Steve.

 

Error log and follow up comments:

 

 

+ egrep ABORTED|CRITFAIL|C-Error

      9231188.00 BATCH     AREGDYMP.VPLUS_RCAPT10/28 22:16 00:00:02 ABORTED                AREGDYMP_MATH_PLACEMENT_LOAD

+ print Failure in spawned VPLUS_RCAP - abort this module

Failure in spawned VPLUS_RCAP - abort this module

+ exit 1

+ err=1

 

I followed Elden's approach from when this aborted with the same error on 10/14 --  checked if the file that RCAPTURE was trying to capture to Vista Plus was in Vista Plus -- it wasn't.  I also checked the driver for the RCAP-LOOP and found it just had the one entry, so I reset the job.  It finished and I confirmed the report is in Vista Plus.

Steve.

 

 

 

Aborted Module Name:   KFSXAPPO.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

10/29/12     Mon        20:07          Follow up below.

 

Error log and follow up comments:

 

 

2012-10-29 20:08:18,272 [RMI TCP Connection(150)-129.82.127.238] INFO  org.kuali.kfs.module.purap.document.service.impl.PurchaseOrderServiceImpl :: autoCloseFullyDisencumberedOrders() PO ID 352881 with total 845.03 will be closed

2012-10-29 20:08:18,668 [RMI TCP Connection(150)-129.82.127.238] ERROR org.kuali.rice.kns.util.ObjectUtils :: error getting property value for  class edu.csu.kfs.module.purap.document.PurchaseOrderDocument.checkPostingYearForCopy Unknown property 'checkPostingYearForCopy'

 

The receiving is messed up on this PO.  John Swaro cannot manually close either. 

This one is going to need a SQL close.

PO# 352881.

Then just let nightly batch run tonight and the job should go.

Theresa.

 

Hi Dermot

We will need to sql close this po.  Please prepare the task.  Use the update statement you asked about. 

We will need a dba.

Josh.

 

 

 

Aborted Module Name:   ODSRAROS.ODSRS003_01

  Date:        Day:      Time:          Resolution:

11/01/12     Thu        00:12          Restarted by Dermot.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUT_AR_SMR_FRZ

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 21

 

I am manually running the mapping now.

Is there anything dependent on this finishing successfully?

Mark. P.

 

Just the ODSRPROD_REFRESH_ODSPROD, all the other Refreshes have completed.

Dermot.

 

Joleen stopped by and said the same thing (more or less), but I need to know more detail of what ODSRPROD_REFRESH_ODSPROD is. 

What components does it contain?

Mark. P.

 

These are the components that are left to run in ODSRPROD:

 

 

APMX FOLLOW UP = Generate various followup reports based on files present in /ais01/dat/misc/followup

SEND_MAIL = An email to update us on all the Refreshes

ODSRS004_02 = Log ODS Refresh Begin/End Times

CHAIN_FINISH = Chain Finish Tasks - cleanup/backup files, send email, etc.

Joleen.

 

It completed successfully in 1:48:57, go ahead and release the remaining jobs/components.

BEGIN

csug_run_owb_task('OWBREP', 'ODS_CSUBAN_LOCATION', 'PLSQL', 'LOAD_CSUT_AR_SMR_FRZ');

END;

/

PL/SQL procedure successfully completed

SQL>

SELECT count(*) FROM CSUT_AR_SMR;           --149,192 records

SELECT count(*) FROM CSUT_AR_SMR_FRZ;  --149,192 records

 

And thank you Joleen for the additional details on what was left to run.

Mark. P.

 

 

 

 

 

 

 

Aborted Module Name: AREGHRTM_FA.AREGS415_01    

  Date:        Day:      Time:          Resolution:

11/05/12     Mon        00:01          Restarted by Joleen.

 

 

Error log and follow up comments:

 

 

FA had this comment:

No more data to read from socket

I checked for conditions, I restarted and it finished.

Joleen.

 

 

 

Aborted Module Name: AREGHRTM_SP.AREGS415_01 

  Date:        Day:      Time:          Resolution:

11/05/12     Mon        00:01          Restarted by Joleen.

12/31/12     Mon        00:04          Restarted by Dermot.

01/07/13     Mon        00:02          Restarted by Joleen.

 

Error log and follow up comments:

 

11/05/12.

SP had this comment:

Closed Connection

I checked for conditions, I restarted and it finished.

Joleen.

 

12/31/12 .

Rec'd page:

"LAUNCH ERROR AREGHRTM_SP.AREGS415_01 9607078.01".

Restarted AREGHRTM_SP.AREGS415_01 /

AREGHRTM_SECTION_ENROLLMENT which has now finished.

Dermot.

 

01/07/13.

I was paged with the following message:

“LAUNCH ERROR AREGHRTM_SP.AREGS415_01 9646219.01”

There is no output file and no conditions. I restarted the job and it has finished.

Joleen.

 

 

Aborted Module Name:  AREGTTRN.RWCLIENT_01

  Date:        Day:      Time:          Resolution:

11/07/12     Wed        06:22          Restarted by Joleen.

07/26/13     Fri          16:37           Restarted by Joleen.

 

Error log and follow up comments:

 

11/07/12.

<<errtrap_ssh.6>> print *** \n*** ERROR: Sneffels SCRIPT ABORTED - EXIT CODE=1 \n***

***

*** ERROR: Sneffels SCRIPT ABORTED - EXIT CODE=1

***

<<errtrap_ssh.7>> exit

+ grep SCRIPT ABORTED

+ /ais02/log/AREGTTRN.RWCLIENT_01.9291261.9291275.00.2012_11_07_0622.log

+ 1> /dev/null

+ print rwclient execution unsuccessful

rwclient execution unsuccessful

+ cat /ais02/log/AREGTTRN.RWCLIENT_01.9291261.SEND_MAIL_ERR.DAT

REP-0177: Error while running in remote server Engine rwEng-0 crashed, job Id: 276171

 

Mark P. re-booted and we restarted the component and it completed.

Joleen.

 

07/26/13.

The RWCLIENT output said it couldn't establish a connection to Sneffels. While I was investigating, my AppMan screen went black. When I was able to log back in to AppMan, I restarted the RWCLIENT and the OSYS system related process flow that was in DB Error. The RWCLIENT failed again with the same error of not being able to establish a connection. I didn't see any DBAs around. I caught Rich as he was leaving and he looked to see if Sneffels was up. Everything looked OK with Sneffels. We restarted RWCLIENT and this time it finished running.

-----Original Message-----

From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]

Sent: Friday, July 26, 2013 4:39 PM

To: IS DL: Alert APMX

Cc: gudrun.kokoszka@gmail.com

Subject: AWPROD APMXCHKS Abort Job Backlog Warning

 

Fri Jul 26 16:37:26 MDT 2013                                                                                                    Page 1

                                            Check Backlog for ABORTED jobs (so_status  202)                                           

Job                  Chain Id Start Date              Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

-------------------- -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

AREGTTRN.RWCLIENT_01 11060761 07-26-2013 16:24:49 MDT    202 ABORTED             7938.56                    749                      9

 

 

 

Aborted Module Name:   AREGDYDL.FTPS_CURL_01

  Date:        Day:      Time:          Resolution:

11/09/12     Fri         20:04           Restarted by Elden.

 

Error log and follow up comments:

 

 

# > > USER $PUCSU

# > < 331 Send password please.

# > > PASS **********

# > < 530 PASS command failed

# > * Access denied: 530

 

I updated the old password with the temp password, then ran the password update, then reset FTPS_CURL_01.  Please see the news file for additional information and let me know if you have questions.

Elden.

 

11/11/2012 02:56    EFLICK

Since we're going to have some maintenance in the morning and would like the schedule as clean as possible, I decided to look into AREGDYDL from Robin's ABORT note.

Since we've been having problems getting the password changed on the remote system and Jerry Becker contacted DOR about this, I suspected they must have reset the password.

I found an email confirming this from Jerry on Friday after I left.

I requested the AREGSPWD_DL_CHG_PASSWORD process flow with a hold, deleted the SEND_MAIL_01 and WAIT_FOR_RLSE since we already had these completed on Friday.  Then I reset this process flow -- it finished successfully.

Then I reset AREGDYDL.FTPS_CURL_01, which finished successfully.

 

 

 

 

Aborted Module Name:  HRMSKFSA.HRMS_SPAWN_OUT_01

  Date:        Day:      Time:          Resolution:

11/12/12     Mon        17:45           Deleted by Robin.

01/16/13     Wed        17:47           Deleted by Robin.

 

Error log and follow up comments:

 

*** COPY SPAWNED CONCURRENT REQUEST OUTPUT TO JOB OUTPUT FILE

+ print *** \n*** OUTPUT FROM SPAWNED CONCURRENT REQUEST 7429647 (PARENT REQUEST 7429624): \n***

+ 1>> /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out

+ cat /oraapps/hrprod/out/o7429647.out

+ 1>> /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out

+ read this_spawned_req

+ cut -f2 -d ?

+ print 7429648?G

+ grep C

+ print *** \n*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION \n***

***

*** SPAWNED CONCURRENT REQUEST - UNSUCCESSFUL COMPLETION

/ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out

^L

LDM SOB 2                                         Journal Import Execution Report                              Date: 12-NOV-12 17:41

Concurrent Request ID: 7429647                                                                                 Page:               2

** Batches listed under "Unbalanced Batches**" have not been imported.

No resolution to this ABORT, it was decided to delete the ABORTED job and let the remaining dependent chain “KFSXCS41” complete.

HRMSKFSA.HRMS_SPAWN_OUT_01 / HRMSKFSA_KFS_ADJUSTMENTS ran successfully on the next nightly cycle.

Dermot.

 

01/16/13.

Error similar to the one received on 11/12/12.

 

I’ve got Steve Hill looking at this abort.  The exact same thing happened in November with this job.

Steve G.

 

Steve Hill informed Stephen that he will research failed concurrent manager job 7507661. We won’t get output files for this job at this time.  

As a result I reset component HRMSKFSA.HRMS_SPAWN_OUT_01 once I made changes to allow for  correct output file collection.

Deleting the component would have caused NO VPLUS output to be captured. The script exits before moving the incomplete output file collected so far to a tempdir from where our AppMan VistaPlus component picks it up for VistaPlus upload.

Changes made prior to reset of component HRMSKFSA.HRMS_SPAWN_OUT_01 :

1.        Deleted string 7507661?G in file /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_FIND_01.spool.lis

2.        Moved existing file /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out to /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out.bkp

The reset needs to create a new file. (Otherwise the existing file would be appended to, resulting in duplicate output.)

Gudrun.
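
The trace for this module suggests that each entry in the spool file has the form "request id?status", and that the script only accepts a status containing C (the cut -f2 -d ? followed by grep C). Under that reading, the check and the manual removal of the 7507661?G entry can be sketched as below. The file layout is inferred from the trace, the function names are illustrative, and request id 7507662 in the usage note is made up for the example.

```python
def parse_spool(lines):
    """Split 'request_id?status' lines into (request_id, status) pairs, skipping blanks."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        request_id, _, status = line.partition("?")
        entries.append((request_id, status))
    return entries


def unsuccessful(entries):
    """Request ids whose status does not contain 'C', mirroring the grep C in the trace."""
    return [request_id for request_id, status in entries if "C" not in status]


def drop_request(lines, request_id):
    """Return the spool lines without the entry for request_id."""
    return [line for line in lines if line.strip().partition("?")[0] != request_id]
```

For example, drop_request(["7507661?G", "7507662?C"], "7507661") leaves only the completed request, so a reset collects output for the remaining requests without tripping on the failed one.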

 

 

 

 

Aborted Module Name:   KFSXAPEI.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

11/16/12     Fri         05:31           Restarted by Dermot.

 

Error log and follow up comments:

 

orkflowDocumentServiceImpl :: routeDocument: org.kuali.rice.kew.routeheader.DocumentRouteHeaderValue@54965496[

  routeHeaderId=2150801

  documentTypeId=320823

  docVersion=1

  docTitle=Electronic Invoice Reject Document - PO: 356801 Vendor: Fisher Scientific Co

  createDate=2012-11-16 05:31:16.0

  initiatorWorkflowId=1

  routedByUserWorkflowId=1

  docRouteStatus=R

  routeStatusDate=2012-11-16 05:32:59.639

  statusModDate=2012-11-16 05:32:59.639

  docRouteLevel=0

  routeLevelDate=<null>

  approvedDate=<null>

  finalizedDate=<null>

  appDocId=<null>

 

2012-11-16 05:32:59,765 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Saving Invoice Reject for DUNS '150982189'

2012-11-16 05:32:59,765 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.rice.kns.document.DocumentBase :: invoking rules engine on document 2150806

2012-11-16 05:32:59,803 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.rice.kns.document.DocumentBase :: [document.invoiceOrderReferenceDocumentReferencePayloadIdentifier] error.format.org.kuali.rice.kns.datadictionary.validation.charlevel.AnyCharacterValidationPattern(Invoice Order Reference Document Reference Payload Identifier (Identifier))

2012-11-16 05:32:59,803 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.rice.kns.document.DocumentBase :: [document.invoiceOrderReferenceOrderIdentifier] error.format.org.kuali.rice.kns.datadictionary.validation.charlevel.AnyCharacterValidationPattern(Invoice Order Reference Order Identifier (Identifier))

2012-11-16 05:33:00,022 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.rice.ksb.messaging.serviceproxies.MessageSendingTransactionSynchronization :: Message [RouteQueue: , routeQueueId=null, ipNumber=129.82.127.238serviceNamespace=KFS, serviceName={KFS}SearchableAttributeProcessorService, methodName=indexDocument, queueStatus=R, queuePriority=30, queueDate=2012-11-16 05:30:34.059] not sent because transaction not committed.

 

2012-11-16 05:31:23,463 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.kfs.module.purap.service.impl.Elect

ronicInvoiceHelperServiceImpl :: Reject document has been created (DocNo=2150806)

2012-11-16 05:31:23,464 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.kfs.module.purap.service.impl.Elect

ronicInvoiceHelperServiceImpl :: 150982189_8052239899_10047594884081591.xml has been rejected

 

 

I found the offending xml file, renamed it as BAD, removed all the processed files & restarted the ABORTED job & it finished successfully.

Dermot.
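
Note (a rough sketch, assuming the job log has first been un-wrapped so that each log entry is on one line): the offending XML file can be located by searching the log for the "has been rejected" message. The function name rejected_files and the regular expression are assumptions for illustration; only the message text comes from the log above.

```python
import re

# Matches the file name that precedes "has been rejected" in a log entry.
REJECTED = re.compile(r"::\s*(\S+\.xml) has been rejected")


def rejected_files(log_text: str) -> list[str]:
    """Return the XML file names the log reports as rejected."""
    return REJECTED.findall(log_text)


sample = (
    "2012-11-16 05:31:23,464 [RMI TCP Connection(2)-129.82.127.238] INFO  "
    "org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: "
    "150982189_8052239899_10047594884081591.xml has been rejected\n"
)
names = rejected_files(sample)
```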

 

 

 

 

Aborted Module Name: ADMSBSLT.LYNX_01

  Date:        Day:      Time:          Resolution:

11/16/12     Fri         22:30           See follow up below.

 

Error log and follow up comments:

 

STATUS=HTTP/1.1 200 OK

   URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

 

DO NOT restart this Process Flow without the OK from the users. (Kathy Banister, Marcella Vininski)

The user may have rerun the aborted LYNX job manually.

If the LYNX module was rerun by the user, delete the job from AppMan.
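
Note (illustrative sketch, not the actual AppMan output check): the "Status not OK" condition above amounts to finding any STATUS= line whose HTTP code is not 200. The function name failed_statuses is an assumption; the sample lines are copied from the log.

```python
def failed_statuses(output: str) -> list[str]:
    """Return every STATUS= line whose HTTP status code is not 200."""
    failures = []
    for raw in output.splitlines():
        line = raw.strip()
        if not line.startswith("STATUS="):
            continue
        parts = line.split()
        code = parts[1] if len(parts) > 1 else ""
        if code != "200":
            failures.append(line)
    return failures


sample = (
    "STATUS=HTTP/1.1 200 OK\n"
    "   URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)\n"
    "STATUS=HTTP/1.1 500 Internal Server Error\n"
)
bad = failed_statuses(sample)
```

Because the standard output file is appended to on every run, a check like this would also pick up a failure left over from an earlier run unless only the newest part of the file is scanned.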

 

Marcella,

Do you want us to restart this chain?

Vicki.

 

I’m still trying to figure out what the error was and why it aborted.  Please let me look into this a little further before you rerun the program. 

Marcella.

 

Here is some more info from the lynx stdout file:

It looks like maybe it was having connection issues?

        <title>The remote server returned an error: (425) Can't open data connection.</title>^M

        <style>^M

         body {font-family:"Verdana";font-weight:normal;font-size: .7em;color:black;} ^M

         p {font-family:"Verdana";font-weight:normal;color:black;margin-top: -5px}^M

         b {font-family:"Verdana";font-weight:bold;color:black;margin-top: -5px}^M

         H1 { font-family:"Verdana";font-weight:normal;font-size:18pt;color:red }^M

         H2 { font-family:"Verdana";font-weight:normal;font-size:14pt;color:maroon }^M

         pre {font-family:"Lucida Console";font-size: .9em}^M

David.

 

Let’s go ahead and rerun this then if you think it won’t cause any additional problems. 

Marcella.

 

Though ADMSBSLT.LYNX_01 completed in AppMan, Gudrun did notice an error in the log file (highlighted below). Can you verify that this worked okay on your end?

   URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

   URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)

David.

 

The file was created and it looks good, so I'm not sure why we received this error.  I need to do some further research on why we are still getting this error.  Do you know if the file layout was recently changed? 

Marcella.

 

Thanks for checking. The error may be from the previous run.

David.

 

I see the problem: there is a duplicate in the file that has two different Slate IDs.  We are having trouble with Slate creating duplicates on their end and are working on resolving this issue with them. In the meantime I’ll clean up his Banner and Slate record so we don’t have this issue with his record, though we may see this again with other duplicates that Slate is creating.

Marcella.

 

 

 

Aborted Module Name:   AREGSPWD.FTPS_NEW_PASS_01

  Date:        Day:      Time:          Resolution:

11/21/12     Wed       10:25           See follow up below.

07/28/14     Mon       09:35           Restarted by Joleen.

 

Error log and follow up comments:

 

11/21/12.

# CMDOUT #  [331 Send password please.]

# CMDOUT #  [530 PASS command failed]

# CMDOUT #  [530 You must first login with USER and PASS.]

# CMDOUT #  [DONE]

 

Jerry,

Is this a problem with the new password, or something else?

Steve. G.

 

I'm guessing the current password didn't work?  Let me know if I need to have the state re-set it.

Jerry.

 

The old password should still work -- do you have a method to manually log in and verify that the old one still works?

 

If you can't log in with the old password, then we will need to have the State reset it.

 

Can you ask them why we're having problems with it and see if they require us to keep the same password for a certain amount of time?

Elden.

 

I don't have direct access to their system so I can't actually log in myself.  But I'll certainly ask if we need to keep a password a minimum amount of time.

Jerry.

 

 

# CMDOUT #  [331 Send password please.]

# CMDOUT #  [530 PASS command failed]

# CMDOUT #  [530 You must first login with USER and PASS.]

# CMDOUT #  [DONE]

 

I see what I did - the new password I created had a repeating character.  I'll change it.

Jerry.
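
Note (a sketch under stated assumptions): the log does not say exactly what "repeating character" rule the remote site enforces, so both readings are shown: the same character twice in a row, or any character used more than once. The function names are hypothetical, and neither check is the State's actual password policy.

```python
def has_adjacent_repeat(password: str) -> bool:
    """True if the same character appears twice in a row."""
    return any(a == b for a, b in zip(password, password[1:]))


def has_any_repeat(password: str) -> bool:
    """True if any character occurs more than once anywhere."""
    return len(set(password)) < len(password)
```

A candidate password can be run through whichever check matches the site's rule before it is submitted via AppMan.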

 

07/28/14.

 

# CMDOUT #  [331 Send password please.]

# CMDOUT #  [530 PASS command failed]

# CMDOUT #  [530 You must first login with USER and PASS.]

 

Ok - I called the DOR and they've reset the password to what I had submitted via AppMan.

Jerry.

 

 

 

 

Aborted Module Name:   ODSRAROS.ODSRS003_01

  Date:        Day:      Time:          Resolution:

12/01/12     Sat          00:04           See follow up below.

01/01/13     Tue        01:57            Restarted by Joleen (see followup).

 

Error log and follow up comments:

 

12/01/12.    

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUT_AR_SMR_FRZ

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 21

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

Here is the error I found:

CSUT_AR_SMR_FRZ       ORA-08103: object no longer exists  01-DEC-12       CSUBAN

Same error as Nov. 1: the table is there and I can select from it.  The mapping did run but must have lost its connection before it finished. 

Please re-start the job and I will create a Clarity ticket for it on Monday to look into the problem more.

Mark P.

 

12/01/2012 12:48    MPAQUETT

Error with OWB mapping LOAD_CSUT_AR_SMR_FRZ

ORA-08103: object no longer exists Table does exist in ODS Prod. 

I have asked Robin to run the job again and notify me if it errors again.

Had same error on Nov. 1 run, will create a Clarity ticket to look into problem further.

 

01/01/13.   

ORA-20000: ERROR running LOAD_CSUT_AR_TERM_DATA_FRZ

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 18

 

I called the DBA cell a few times and I left a message a few minutes ago with Mark P.

Joleen.

 

It looks like a data problem, here is the error I am seeing.

481         CSUT_AR_TERM_DATA_FRZ         ORA-01427: single-row subquery returns more than one row     01-JAN-2013 01:01:47 AM

Mark P.

 

Do you need me to delete this component? It looks like we might need to have the stakeholders take a look?

Joleen.

 

 

 

 

 

 

Aborted Module Name:  AGENDYGN.AGENS006_01

  Date:        Day:      Time:          Resolution:

12/03/12     Mon        19:01          Restarted by Joleen.

 

Error log and follow up comments:

 

11378385 DX   01-JAN-00 Address update needed                                                  

Error: ORA-20100: ::Hold from date must be less than or equal to hold to date::

 

ERROR at line 1:

ORA-20100: ::Hold from date must be less than or equal to hold to date::

ORA-06512: at line 1886

 

19:01:10 1884      utl_file.fflush(file_handle);

19:01:10 1885      utl_file.fclose(file_handle);

19:01:10 1886      raise; -- reraise the exception

 

 

The problem is with the end dates on addresses (12/31/2099).

When the most future Mailing Address has an end date on it, we place a hold (DX) on that person starting on the day after the mailing address ends – in this case 12/31/2099 + 1 day = 01/01/2100.

We should not be putting end dates on addresses 87 years in the future!
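
Note (illustrative arithmetic only): the hold start date described above is the address end date plus one day. The sketch below reproduces that calculation. The hold "to" date used for the comparison is a hypothetical value, since the log does not show what the job compared against; the two-digit year is shown because the log line prints the date as 01-JAN-00.

```python
from datetime import date, timedelta

address_end = date(2099, 12, 31)               # end date on the mailing address
hold_from = address_end + timedelta(days=1)    # hold starts the following day

# The log prints the hold date with a two-digit year (01-JAN-00).
two_digit_year = f"{hold_from:%y}"

# Hypothetical upper limit for the hold "to" date; not taken from the job.
assumed_hold_to = date(2099, 12, 31)
passes_check = hold_from <= assumed_hold_to
```

With any hold "to" date on or before 12/31/2099, the "from date must be less than or equal to to date" check fails, which is consistent with the ORA-20100 message.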

 

PIDM         CSU ID       Last Name    First Name   Middle Name

11378385     829995699    Li           Yinan

 

 

Karen,

I see you are putting many future-dated end dates on addresses that I believe are unnecessary.  Many are on RA addresses.

This causes serious problems when they are placed on MA addresses.

 

Can someone please remove the To Date on this person’s Mailing Address and let us know so that we can restart the schedule?

Vicki.

 

I removed the To Date on both the mailing and RA addresses for the student below.  Will you be able to locate and replace any others I may have done?   So sorry, I will leave the address To Date blank going forward!

Karen.

 

 

 

 

 

Aborted Module Name:   HOUSADDR.AGENS016_01

 

  Date:        Day:      Time:          Resolution:

12/28/12     Fri         17:01           Deleted by Joleen.

12/31/12     Mon      17:01           Resubmitted by Joleen.

08/04/14     Mon      10:34           Resubmitted by Joleen.

 

Error log and follow up comments:

12/28/12.

Problem Inserting the Address record for:                          

A,11301960,HA,"C317 Summit Hall",,,Fort Collins,CO,805215244,,20121228,20130517,HOUS,,,,,RMS

 error is: ORA-20302: An address cannot be added with the same from_date as an existing address.

 

ERROR at line 1:

ORA-20500: Error count Exceeded 200

ORA-06512: at line 316

 

17:01:20 314 

17:01:20 315         if v_api_count > 200 then

17:01:20 316             raise_application_error(-20500,'Error count Exceeded 200');

17:01:20 317         end if;

 

I manually ran this process early in the day Friday to create the new file for the spring semester and I thought I had shut down the automated process.  Instead, it looks like the automated process also ran and created a duplicate file.

The combined files were sent over to kebler and it looks like they choked and died.  I'll re-create the file and send it over tonight.

Greg Fend.
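
Note (a minimal sketch, assuming the field layout can be read off the sample record above): a combined file could be checked for records that would trip ORA-20302 by counting repeated (ID, address type, from date) combinations before it is sent. The positions used (second, third and eleventh comma-separated fields) and the function name duplicate_keys are assumptions drawn from the one record shown, not from the file specification.

```python
import csv
import io
from collections import Counter


def duplicate_keys(text: str) -> list[tuple[str, str, str]]:
    """Return (id, address type, from date) keys that occur more than once."""
    rows = csv.reader(io.StringIO(text))
    counts = Counter((r[1], r[2], r[10]) for r in rows if len(r) > 10)
    return [key for key, n in counts.items() if n > 1]


record = (
    'A,11301960,HA,"C317 Summit Hall",,,Fort Collins,CO,805215244,,'
    "20121228,20130517,HOUS,,,,,RMS\n"
)
combined = record + record   # what two runs appended together would look like
dups = duplicate_keys(combined)
```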

 

I have deleted the job from Friday. Tonight's job will pick up your re-created file.

Joleen.

 

12/31/12.    

FYI - I bet the address job aborted again on Monday night because the old address file was still on kebler.  I've deleted the old file and uploaded the correct one.

Is it possible to re-run the job to load these addresses?

Greg Fend.

 

I submitted HOUSADDR_ASSIGNMENT_EXPORT and it has finished running.

Joleen.

 

08/04/14.

ERROR at line 1:

ORA-20500: Error count Exceeded 200

ORA-06512: at line 316

I have attached the utl file, which shows the errors.

 

Thanks Joleen.  It looks like the file may not be formatted correctly.

I'll take a look and see what I can discover.

Greg.

 

 

Aborted Module Name:   FAIDEPLS_OD.LYNX_02

  Date:        Day:      Time:          Resolution:

01/04/13     Fri         10:06           Restarted by Joleen.

 

Error log and follow up comments:

 

 

The file you sent me said that the error was here:

pidm = (decimal)rdrGetPop["pidm"];

fund_code = (string)rdrGetPop["fund_code"];

offer_amt = (decimal)rdrGetPop["offer_amt"];

 

but that can’t be because that’s not the code in the file (I removed the “(decimal)” cast text). The code in the file is this:

pidm = rdrGetPop["pidm"].ToString(); //prod

fund_code = (string)rdrGetPop["fund_code"];

offer_amt = (decimal)rdrGetPop["offer_amt"];

 

This means that either the lynx procedure doesn’t point at WSNET, WSNETDEV, or it’s storing the page in some sort of cache and not requesting new page content from the server…

Zach Garno.

 

Here is the URL it is using:

http://wsnet.colostate.edu/cwis231/onet/eplus/faidepls_api_rprawrd.aspx?ay=1213&treq=OD   Correct?

I believe the LYNX step runs the latest code each time. Is this not true?

This was re-started at 07:49 this morning. Has your code changed since then? Do you want us to try again?

David.

 

Please try to re-run the job.

Zach.

 

The first LYNX step completed successfully. The second LYNX step has now failed. I have attached the LYNX standard output.

Just an FYI, the standard output file appends so the latest info will be at the bottom of the file if we have to rerun.

David.

 

 

 

Aborted Module Name:   ODSRAGEN.ODSRS001_01

  Date:        Day:      Time:          Resolution:

01/16/13     Wed        03:00          Restarted by Dermot but ABORTED again @ 8am, see below.

 

Error log and follow up comments:

 

ODSRAGEN.ODSRS001_01.9701248.9701250.00.2013_01_16_0004.jobout 100

no output from ODSRAGEN.ODSRS001_01

+ err=100

 

I checked this CRITFAIL & did not find any similar errors in our ABORT logs.

I called and left a message on the DBA cell.

 

As per Craig, I restarted ODSRAGEN.ODSRS001_01 @ 03:44.

Dermot.

 

The mapping DELETE_MST_GENERAL_STUDENT failed last night with the following error.

Error Msg: ORA-01555: snapshot too old: rollback segment number 19 with name "_SYSSMU19_2294121418$" too small

This caused UPDATE_MST_GENRL_STDNT_STEP_2 to fail with a unique constraint error because the delete did not complete.  This is the same error we ran into last week.  I will work on adjusting the rollback segment to help eliminate this problem in the future.

Also note there appears to be a larger than normal download of data hitting the ODS today, so expect things to run longer than average.

Mark P.

 

It looks like the UPDATE_MST_GENRL_STDNT_STEP_2 has stalled out.  It has been running since 3:26 this morning and usually runs in 15 mins (give or take).

I did work with Mark B to get the rollback segment adjusted on ODS Prod.

I am going to kill the process on the Oracle side and would like to have it restarted via Appman.

Mark P.
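
Note (illustrative sketch only): the "stalled" judgement above compares elapsed run time against the usual run time. The threshold factor of 4 and the function name looks_stalled are assumptions for the example, and the 08:00 check time is taken loosely from the header line rather than from a timestamp on the message.

```python
from datetime import datetime, timedelta


def looks_stalled(started: datetime, now: datetime,
                  typical: timedelta, factor: float = 4.0) -> bool:
    """Flag a job whose elapsed time exceeds factor times its usual duration."""
    return (now - started) > typical * factor


started = datetime(2013, 1, 16, 3, 26)    # "running since 3:26 this morning"
checked = datetime(2013, 1, 16, 8, 0)     # assumed time of the check
typical = timedelta(minutes=15)           # "usually runs in 15 mins"
stalled = looks_stalled(started, checked, typical)
```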

 

The UPDATE_MST_GENRL_STDNT_STEP_2 mapping completed and things are still moving…

Mark P.

 

 

 

 

Aborted Module Name:   FAIDSAIG_EV.TDCLIENT_01

  Date:        Day:      Time:          Resolution:

02/02/13     Sat          02:06           See follow up below.

 

Error log and follow up comments:

I verified that the component aborted while trying to receive ISIR files. No files were received!

# 20130202-000601 : pipe_exec                 | cmdout = <WARNING: Failed to connect to server>

# 20130202-000601 : pipe_exec                 | cmdout = <Error connecting to network SAIGPORTAL>

# 20130202-000601 : pipe_exec                 | cmdout = <(-1) FTP connection attempt failed.>

# 20130202-000601 : pipe_exec                 | cmdout = <Connection refused

I attempted a reset again around 2am, but with the same result: the connection failed again. That should have given them time to troubleshoot.

Either the server on their side is still having issues or the password is incorrect. This needs to be followed up during the daytime. At the moment I don’t know whom to call on their side. I also checked whether we changed the password yesterday: we did not.  FAIDSPWD did not run yesterday.

Critical AppMan job component FAIDSAIG_EV.TDCLIENT_01  failed last night. We are investigating. Connection attempts to SAIGPORTAL are being refused.

Gudrun.

 

Password issue? Did FAIDSAIG_OD run successfully? SAIGPORTAL may be down??

Phil.

 

To confirm 100% that SAIGPORTAL is down I am contacting Rich. I need access to quartz. An ftp connection attempt from the command line should support the conclusion that SAIGPORTAL is down.

I checked with Rich, and the information received supports that SAIGPORTAL is down. An FTP attempt from quartz to server SAIGMAILBOX.ED.GOV times out, as does any ping command; however, pings might be blocked. Unless server access is restored, the production FAID schedule started yesterday won’t complete this weekend. I think Candy Chapman needs to be contacted; however, I believe you would prefer doing this. Let me know if I can help with anything else.

Gudrun.
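
Note (a rough sketch, not the command that was run on quartz): since ping may be blocked, a plain TCP connection attempt to the FTP control port is a more direct test of whether the server accepts connections. The function name can_connect and the timeout values are assumptions; port 21 is the standard FTP control port. The demonstration uses a local socket so that it does not depend on the remote server.

```python
import socket


def can_connect(host: str, port: int = 21, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Local demonstration: a listening socket accepts, a closed one refuses.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
local_port = listener.getsockname()[1]
reachable = can_connect("127.0.0.1", local_port, timeout=2.0)
listener.close()
refused = can_connect("127.0.0.1", local_port, timeout=2.0)
```

Against the real server the call would be can_connect("SAIGMAILBOX.ED.GOV"), run from a host that is allowed to reach it.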

 

Candy,

I just would like to inform you that we won’t be able to complete the FAID schedule this weekend

UNLESS access to SAIGPORTAL server SAIGMAILBOX.ED.GOV gets re-established.

Last night FAIDSAIG_EV.TDCLIENT_01 aborted with ftp error – connection refused. Several reset attempts after midnight failed since then with the same error. The AROS schedule may not complete this weekend due to a delay in the FINAID schedule. Access to a state server has not been restored at this point in time. The server refuses connections. I sent out email alerts also to CLMS and AROS IS DL email lists to alert them of their schedule being delayed. Emailed Candy Chapman. I don’t have her phone number. But do think Phil needs to call. He has been contacted and responded earlier to an email.

Gudrun.

 

I found an old Email with Candy Chapman's number and then called Vicki to verify that it was okay to call her. Vicki said to do so. I called Candy and she said that the SAIG server is down because of a planned outage. Candy was going to cancel the affected FAID jobs but had forgotten. She is also on the road now, but will send an Email of jobs we can turn the flags off so they don't run. David.

 

I apologize. Karma thought she had canceled the affected jobs, but she thought we just couldn't SEND stuff to SAIG. She hadn't thought about picking stuff up from them. So we think we might as well cancel the entire schedule. She thinks they will be down until sometime tomorrow. We'll figure it out on Monday.

Candy.

 

I called Josh. I was not sure, and he needs input, on whether it is OK to release CLMS process flow CLMSDISB_FINAID_DISBURSEMENTS. Dependent AROS process flows were released earlier this morning after the FAID schedule got deleted. The remaining backlog of FAID schedule process flows got deleted, except for FAIDAM99.   The AROS schedule completed as scheduled, with no deletions. Rob gave the OK for the remaining AROS process flows to run despite the FAID schedule deletion. The CLMS schedule is STILL on hold, waiting on the OK from Phil that CLMSDISB_FINAID_DISBURSEMENTS can be released.  

Gudrun.

 

Phil called; I missed his call but called him back. He said to delete CLMSDISB but allow CLMSDATA to run. I did this and the CLMS schedule is complete.

David.

 

 

 

 

Aborted Module Name:   FAIDTRAK_OD.GLBDATA-LOOP_02

  Date:        Day:      Time:          Resolution:

02/14/13     Thu         00:51           See follow up below.

 

Error log and follow up comments:

 

02/14/13.

230 Error Timed out waiting for response on client pipe 240.

+ iteration_done=yes

+ [[ yes = no ]]

+ spawned_module_name=GLBDATA

+ . SRC_APMX_STATUS_FOR_SPAWNED.KSH

+ set -x

+ awexe jh

+ egrep ABORTED|CRITFAIL|C-Error

+ grep 9880772

+ print FAIDTRAK_OD_DT_HOLD

+ 1>> /ais01/dat/work/prod/FAIDTRAK_OD.GLBDATA-LOOP_02.selections_done

+ awexe upd_var_value subvar=#glbdata_iterations_9876682 var_value=21 flag=Y

 

A UC4 ticket was created for prod error: “230 Error Timed out waiting for response on client pipe 240.” UC4 Ticket #211836

Subject: 230 Error Timed out waiting for response on client pipe 240.

Priority Level: 2  (1 highest – 4 lowest)

230 Error Timed out waiting for response on client pipe 240.
Last week we observed the above *Time out* error in our production instance twice. Our test instance also encountered the same issue.

A general slowdown is being observed in both instances.
Could you suggest a troubleshooting strategy and possible tuning parameters we could point out to our system staff?
Gudrun.

 

 

 



 

Aborted Module Name:   HRMSS230.HRMSS230_01

 

  Date:        Day:      Time:          Resolution:

02/16/13     Sat          07:21           See follow up below.

03/16/13     Sat          06:50           See note from Gudrun below.

 

Error log and follow up comments:

 

HRMSS230.HRMSS230_01 is in EMPTY_EXTR status:

 

Abort status was set by a check file condition. The condition is looking for {#utl_file1_{chain_id}}. I do see a utl file1 for HRMSS230.HRMSS230_01 out in /orautl/hrprod but it doesn’t have the chain id. I’m not sure where the job is looking for the utl file1. Someone smarter than me can figure this one out.

 

I checked this out -- as Joleen said, the utl file was indeed present, so I tried restarting the aborted job and it finished successfully.  Maybe just some weird timing issue?

Steve. G.
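
Note (a sketch of one possible workaround, not how the AppMan check file condition works): if the abort really was a timing issue, a check that polls for the file a few times before giving up would tolerate a short delay. The function name wait_for_file, the attempt count and the delay are all assumptions, and the demonstration uses a scratch directory rather than /orautl/hrprod.

```python
import tempfile
import time
from pathlib import Path


def wait_for_file(path: Path, attempts: int = 5, delay: float = 1.0) -> bool:
    """Poll for a file, returning True as soon as it exists."""
    for attempt in range(attempts):
        if path.exists():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)   # pause before the next look
    return False


scratch = Path(tempfile.mkdtemp())
utl_file = scratch / "HRMSS230.HRMSS230_01.utl_file"
found_before = wait_for_file(utl_file, attempts=2, delay=0.01)
utl_file.write_text("extract\n")
found_after = wait_for_file(utl_file, attempts=2, delay=0.01)
```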

 

02/16/2013 10:37    JWEARNE

I logged in earlier today and noticed 2 aborts.

FAIDTRAK_OD.LYNX_01 and HRMSS230.HRMSS230_01. I emailed Candy for the FAID abort. She came in and fixed their Webpage and had me restart. Their schedule is progressing now. She would like to be emailed if there are any more LYNX aborts.

 

03/16/13.

Aborted with status EMPTY_EXTR. The CHECK_FILE AFTER condition failed to detect file HRMSS230.HRMSS230_01.utl_file in /orautl/hrprod.

Workaround: PL/SQL completed. No reset. Manually complete the last two conditions before deleting component.

 

This abort got resolved. Before deleting the component I copied the generated file to

/ais01/ftp/to/user/HRMSS230.VSP.DAT

/userfiles/Uhrben/data/HRMSS230.VSP.DAT.

 

Process flow HRMSS230 completed shortly afterwards.

Gudrun.

 

 

 

 

 

Aborted Module Name:   AREGTTRN.SSH_SFTP_01

 

  Date:        Day:      Time:          Resolution:

02/18/13     Mon        00:20           Restarted by Joleen.

05/08/13     Wed        03:20           Restarted by David.

09/10/13     Tue         11:10           Restarted by David.

 

Error log and follow up comments:

 

02/18/13.

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > ssh: connect to host iwantmytranscript.com port 22: A remote host did not respond within the timeout period.

***

*** END SEARCH OF FTP JOBLOG FOR ERROR STRINGS

 

I saw several references to this type of error in the abort log document, and in most cases the job was just restarted.  I tried this and it has finished successfully.

Joleen.

 

05/08/13.

# > ssh: iwantmytranscript.com: Hostname and service name not provided or found

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

 

I re-started after checking that no files had been received.

David.

 

09/10/13.

Something not ok with AREGTTRN.SSH_SFTP_01. Its driver is empty.

Gudrun.

 

I killed this process. It will retry at 11:20

AREGTTRN.SSH_SFTP_01 appears to be working now.

David.

 

 

 

 

 

 

Aborted Module Name:   FAIDTKNT_OD.LYNX_01

  Date:        Day:      Time:          Resolution:

02/28/13     Thu        03:00           Restarted by Joleen.

03/01/13     Fri          03:02           Restarted by Joleen.

 

Error log and follow up comments:

 

FAIDTKNT_OD.LYNX_01 aborted. I saw your email about a new URL. I replaced the URL with the new one below and restarted. It aborted again. Below is the standard output from the second time it aborted. The message from the first abort was: You have requested a page that either never existed or no longer exists on this web server

http://wsnet.colostate.edu/cwis231/autorun/JobChain/parent_tracking_notification.aspx?ay={#2}

 

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<html>

<head>

               <title>Untitled</title>

</head>

<body>

<img src="http://wsdev.colostate.edu/gld_fr_med_wh.gif"><BR><BR>

<font face="Arial">

You have requested a page that either never existed or no longer exists on this web server.<BR><BR>

The web page you are visiting is part of a larger site maintained by a department

at <a href="http://www.colostate.edu">Colorado State University</a>.<BR><BR>

If you came to this page via a "bookmark", this page may have been moved.  You may be able to rectify this

error by contacting the webmaster of this website by going to the main site page and finding any contact information on that page.  Otherwise, visit <a href="http://www.colostate.edu">CSU's main website</a>, and use directory search functionality to contact the department responsible for this website.

 

I'm sorry. I thought the original URL would still work until you could make the change. I'll fix it as soon as I get in  - just a few minutes.

Candy.

 

03/01/13.

    <head>

        <title>ORA-20100: ::Cannot create, record already exists::<br>ORA-06512: at &quot;BANINST1.CSUG_API_GLBEXTR&quot;, line 287<br>ORA-06512: at line 1</title>

        <style>

         body {font-family:"Verdana";font-weight:normal;font-size: .7em;color:black;}

         p {font-family:"Verdana";font-weight:normal;color:black;margin-top: -5px}

 

 

The error means there’s already a record in the GLBEXTR for the ID, so it won’t insert one twice.  I’m not sure why, but let me take a look and see if I can tell which is the duplicate record.

Candy.

 

Has anyone heard anything about what to do with aborted LYNX job FAIDTKNT_OD.LYNX_01 ?

Gudrun.

 

I’m working on it. / Candy fixed the LYNX for us. (Thank you, Candy! You made my day!) FAIDTKNT_TRACK_NOTIFICATION has finished running.

Joleen.

 

 

 

Aborted Module Name:   ODSAROS.ODSRS003_01

  Date:        Day:      Time:          Resolution:

03/01/13     Fri          11:00           Killed & Restarted by Gudrun.

 

Error log and follow up comments:

 

 

We have a refresh that has been running 11 ½ hours. It is: REFRESH_FV_AR. Would it be possible to check it out and make sure that everything is OK and that it is not hung up?

Joleen.

 

It is the LOAD_CSUT_AR_SMR_FRZ, which has caused problems before.  It either runs in under 2 hours or hangs. 

Please kill and restart, I will try to keep an eye on it.   

LOAD_CSUT_AR_SMR_FRZ         00: 00: 00            RUNNING                               01-MAR-2013 01:03:51   01-MAR-2013 01:03:51

Mark. P.

 

I killed and restarted process flow component ODSAROS.ODSRS003_01.

Others out for lunch.

What a Friday ?! … --- still one thing out there to tackle --- ODSRAROS

Do you guys prepare for running overtime for AROSAM99 ?

Gudrun.

 

I just spoke to Joleen about ODSRAROS.  If it is still running this evening, she said she would call the DBA cell and have them look at it again.  We can also force AROSAM99 to finish without ODSRAROS being done so that we can get tonight’s AROS schedule going, which is probably what we will do.

Steve.

 

What are your thoughts about the LOAD_CSUT_AR_SMR_FRZ? We restarted it 3 hours ago and it is still running. We’re trying to formulate a plan for tonight’s schedule.

Joleen.

 

Mark B. called me about the two csuban jobs running and causing problems.  One of the jobs had been running for over 14 hours; it probably didn’t die when we re-started the job this morning (nice catch Mark B.).

I killed it and hope this allows the current one to finish. 

Mark P.

 

Mark P called. The refresh is processing and is not hung. He will check again in a couple hours.

Joleen.

 

The AR FRZ job is done!

01-Mar-2013 22:37:35  ODSRAROS

ODSRAROS_REFRESH_AROS_ODSPROD

Glad to have that obstacle out of the way for the Banner Agent upgrade on Sunday!

Joleen.

 

 

 

 

 

 

 

 

Aborted Module Name:   ADMSSRLD_DY.SQLSURLOAD-LOOP_01

  Date:        Day:      Time:          Resolution:

03/05/13     Tue         22:24           See note from David below.

 

Error log and follow up comments:

 

 

+ print ABRD_BRSPI

+ 1>> /ais01/dat/work/prod/ADMSSRLD_DY.SQLSURLOAD-LOOP_01.sqlsurload_done

 

+ awexe upd_var_value subvar=#sqlsurload_iterations_10003798 var_value=1 flag=Y

ERROR -999 ORA-06502: PL/SQL: numeric or value error: character string buffer too small

 

ORA-06512: at line 1

ORA-06512: at "APPWORX.AWDYN", line 23

ORA-06512: at line 1

 

 

I fixed the variable size. I changed #sqlsurload_iterations_{chain_id} to #sqlsurload_iter_{chain_id} in the SQLSURLOAD-LOOP.KSH

David.
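
Note (illustrative only): the fix above shortens the substitution variable name. The sketch below expands both name patterns with the chain id from the log and compares their lengths. The 30-character limit is an assumption made for the example; the log only says the buffer was too small, not how large it is.

```python
ASSUMED_LIMIT = 30   # hypothetical buffer size; the log does not state it


def subvar_name(template: str, chain_id: int) -> str:
    """Expand {chain_id} in a substitution variable name."""
    return template.replace("{chain_id}", str(chain_id))


old_name = subvar_name("#sqlsurload_iterations_{chain_id}", 10003798)
new_name = subvar_name("#sqlsurload_iter_{chain_id}", 10003798)
old_too_long = len(old_name) > ASSUMED_LIMIT
new_fits = len(new_name) <= ASSUMED_LIMIT
```

Under that assumption the old name is one character over the limit with an eight-digit chain id, while the shortened name leaves room for chain ids to keep growing.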

 

 

Aborted Module Name:   ODSRFAMS_REFRESH_FAMS_ODSPROD

 

  Date:        Day:      Time:          Resolution:

03/12/13     Tue         23:22           Restarted by Joleen.

 

Error log and follow up comments:

 

ORA-20000: ERROR running LOAD_FAMIS_DEPT_SPACE_FUNC

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 288

 

23:22:54 287  WHEN 'REFRESH_FAMIS_CSU' THEN

23:22:54 288    csug_run_owb_task('OWBREP', 'ODS_CSUFAMIS_LOCATION', 'PLSQL', 'LOAD_FAMIS_DEPT_SPACE_FUNC');

23:22:54 289    csug_run_owb_task('OWBREP', 'ODS_CSUFAMIS_LOCATION', 'PLSQL', 'LOAD_FAMIS_EMP_SPACE_DEPT');

23:22:54 290    csug_run_owb_task('OWBREP', 'ODS_CSUFAMIS_LOCATION', 'PLSQL', 'L_SPACE_BUILDING_T');

23:22:54 291  WHEN 'REFRESH_HR_GRAD_ASST_CSU' THEN

23:22:54 292    dbms_mview.refresh('CSUBAN.CSUH_GRAD_ASST_APPROVALS_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

23:22:54 293  WHEN 'REFRESH_SALX_CSU' THEN

23:22:54 294    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_APPT_TYPE_T');

23:22:54 295    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_CUR_FY_GP_T');

23:22:54 296    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_CUR_FY_JA_MTH_T');

23:22:54 297    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_EMPLOYEES_T');

23:22:54 298    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_EMPL_TYPE_T');

23:22:54 299    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_HRDEPT_T');

23:22:54 300    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_JOB_CLASS_T');

23:22:54 301    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_REF_CODES_T');

23:22:54 302    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_SECURITY_T');

23:22:54 303    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_UNIT_CODE_T');

23:22:54 304    csug_run_owb_task('OWBREP', 'ODS_CSUHR_LOCATION', 'PLSQL', 'LOAD_CSUH_SALX_USERS_T');

23:22:54 305  ELSE

23:22:54 306    DBMS_OUTPUT.PUT_LINE ('INVALID REFRESH NAME');

23:22:54 307  END CASE;

23:22:54 308 

23:22:54 309    csug_ods_refresh.log_end_time(mat_view_type);

23:22:54 310 

23:22:54 311  end;

 

There was a locked account on FMSPROD that caused the job to fail. 

Error:

"ORA-28000: the account is locked

ORA-02063: preceding line from FMSPROD@FAMIS_LINK"

The account has been unlocked and I asked David to restart the job.

Mark. P.

 

 

 

Aborted Module Name:   KFSXTXEF.CHAIN_FINISH_01

  Date:        Day:      Time:          Resolution:

03/15/13     Fri           17:01          Deleted by Dermot & re-ran successfully on the 21st March.

 

Error log and follow up comments:

 

+ 1>> /ais01/dat/work/prod/KFSXTXEF.CHAIN_FINISH_01_10072226_jobstat

+ cat /ais01/dat/work/prod/KFSXTXEF.CHAIN_FINISH_01_10072226_jobstat

*** SEARCH OF LAST JOB (10072228.00) AFTER CONDITIONS RUNHOSTLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

                     A file or directory in the path name does not exist.

 

Component KFSXTXEF.CHAIN_FINISH_01 aborted due to a runhost log error in file /ais01/joblog/runhost_10072226_AWPROD.log. File ORIG.29A52*txt could not be moved to bkp.

Cause:

It seems the file was not picked up correctly in the first place because the BEFORE SEND MAIL check file condition checks for the file in the wrong directory. The cause of the faulty check is subvar /{#apmx_last_yyyy}/, which has value 2012 when it should probably be 2013, since the file can be found in dir {#kfsx_staging_{chain_id}}/tax/2013

 

{#kfsx_staging_{chain_id}}/tax/{#apmx_last_yyyy}/ORIG.29A52*txt.

FYI

BOTH SEND_MAIL_01 and SEND_MAIL_02 need to be re-run in their entirety, I assume, to complete the process flow correctly, UNLESS the KFSX_JAVA component should have been run for tax year 2012. (This needs to be verified.)

 

KFSXTXEF.CHAIN_FINISH – a subvar has value 2012 rather than 2013, with the end result that CHAIN_FINISH aborted.

Josh and Dermot have been informed about it. I troubleshot it on Saturday but decided not to fix it. If 2013 should have been the year used for processing, then the two SEND_MAIL components have to be looked at. I believe the evaluation done there also assumed 2012.

Gudrun.
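
A minimal sketch of a pre-check for the wrong-directory problem described above, assuming the staging layout is <staging>/tax/<year>/ and that ORIG.29A52*txt is the pattern the condition looks for. The function name find_tax_file and the idea of trying several candidate years are illustrative only; this is not how the BEFORE condition is actually defined.

```shell
#!/bin/sh
# Hypothetical helper: look for the extract file under each candidate
# tax-year directory and print the first directory that holds a match.
# Usage: find_tax_file STAGING_DIR YEAR [YEAR...]
find_tax_file() {
    staging=$1
    shift
    for year in "$@"; do
        dir="$staging/tax/$year"
        # The glob stays unexpanded when nothing matches, so test with -e.
        for f in "$dir"/ORIG.29A52*txt; do
            if [ -e "$f" ]; then
                printf '%s\n' "$dir"
                return 0
            fi
        done
    done
    return 1
}
```

Called with 2012 and 2013 as the candidate years, it would report which of the two directories really holds the file before the condition is corrected.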

 

Here is the Parameter that is set to 2013.

 

Mike.

 

Here is the SQL that can be used to extract that data. We need to involve BFS on this change.

select txt

from krns_parm_t

where nmspc_cd = 'KUALI-TAX'

and parm_dtl_typ_cd = 'PayeeMasterExtractStep'

and parm_nm = '1099_REPORTING_PERIOD'

Josh.

 

 

Aborted Module Name:   FAIDSAIG_EV.TDCLIENT_01

  Date:        Day:      Time:          Resolution:

03/16/13     Sat           00:04          Restarted by Gudrun.

08/16/13     Fri           00:05          Restarted by Joleen.

 

Error log and follow up comments:

 

03/16/13.

Checked log. Clean abort. ftp failed right from the start. Logged on to quartz to test connection to saigportal server.  Simple ping as well as ftp test to server failed.

Checking again in the early morning hours plus contact FAID team ...

TDCLIENT aborted in AWPROD.  Find attached the log file for the aborted TDCLIENT job. SAIGPORTAL server  is not responding to ftp connection attempts when logged on to quartz as jobprd.  At this point no resolution of the issue is possible. Server simply is not responding to ftp.

AWPROD FAIDSAIG_EV.TDCLIENT_01 aborted in the early morning hours due to an ftp connection failure. Since then, repeated attempts to initiate an ftp connection from our quartz server to SAIGMAILBOX.ED.GOV have failed.  Resolution will have to wait until the service is operable again on their side.

Phil responded back. He will contact FAID staff depending on who can be reached. Candy is out on vacation it appears. He will call me back.

Looking at Backlog and the number of aborts out there, the chance of all of them being resolved by tomorrow morning is a bit slim, so I propose to postpone SP10 patching until next Sunday, March 24. The KFS abort already needs to wait until Monday.

TDCLIENT abort got resolved. Reset of component was successful this morning. 

Gudrun.

 

08/16/13.

********** Start Communications Session

Connecting to server SAIGPORTAL...

FTP connection attempt failed.

Connection timed out

Error connecting to network SAIGPORTAL

(-1) FTP connection attempt failed.

Connection timed out

 

Termination started...

Disconnecting...

********** End Communications Session

 

I know that FAIDSAIG can be tricky. I checked prompts, conditions, and the abort log, and I read through the process flow documentation. I looked at notes I had taken on errors that were OK to restart. This error fell in that category. I restarted FAIDSAIG_EV.TDCLIENT_01.

FAIDSAIG_TDCLIENT_FILE_INPUT has finished running.

Joleen.
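
A rough sketch of the "OK to restart" check described above, assuming the abort log is available as a plain-text file and that a connection timeout like the 08/16/13 one is the only pattern treated as safe to restart. The function name and the two patterns are assumptions for illustration; the real list of restartable errors lives in the operator's own notes.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when a TDCLIENT abort log shows
# a connection failure followed by a timeout, as in the 08/16/13 log.
# Returns 2 when the log cannot be read, 1 when the pattern is absent.
# Usage: tdclient_restart_ok LOGFILE
tdclient_restart_ok() {
    log=$1
    [ -r "$log" ] || return 2
    grep -q 'FTP connection attempt failed' "$log" &&
        grep -q 'Connection timed out' "$log"
}
```

Anything that does not match should still go through the full review of prompts, conditions and process flow documentation.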

 

 

 

 

 

 

 

Aborted Module Name:   FAIDALCT.LYNX_01

  Date:        Day:      Time:          Resolution:

03/18/13     Mon        07:07          Restarted by Dawn.

 

 

Error log and follow up comments:

 

The current error page you are seeing can be replaced by a custom error page by modifying the "defaultRedirect" attribute of the application's <customErrors> configuration tag to point to a custom error page URL.

 

I'm out of town today. Usually you can just restart the LYNX step that failed and it'll often run OK. I don't think anything has changed with that one. If you need more help today Zach will try to help.

Candy.

 

 

 

Aborted Module Name:   HRMSSAL0.RUNGEN_01

  Date:        Day:      Time:          Resolution:

03/19/13     Tue        10:30           See follow up below.

 

Error log and follow up comments:

 

 

HR_6881_HRPROC_ORA_ERR

SQLERRMC ORA-01841: (full) year must be between -4713 and +9999, and not be 0

 

SQL_NO 1539

TABLE_NAME PER_TIME_PERIODS

APP-PAY-06881: Error ORA-01841: (full) year must be between -4713 and +9999, and not be 0  has occurred in table PER_TIME_PERIODS at location 1539

 

Cause:        an oracle error has occurred.  The failure was reported on table PER_TIME_PERIODS at location 1539 with the error text ORA-01841: (full) year must be between -4713 and +9999, and not be 0

 

Action:        Please contact your support representative.

 

I have no idea what the problem is here. The pay periods are set up through Fiscal 2014, and encumbrances have been using them all year.  Chris said Steve is out, so she is sending this on to Bob.

Vickie S.

 

You have probably already figured this out, but the last time this happened there was a value set tied to the concurrent program parameter that was date based and only looked out 60 days.  I needed to change it to 120 days.  I think this value set just needs to be tweaked.

Steve H.

 

 

 

Aborted Module Name:   FAIDALIM.SEND_MAIL_01

  Date:        Day:      Time:          Resolution:

07/08/13     Mon       07:13           Restarted by Joleen.

 

Error log and follow up comments:

 

 

cat: 0652-050 Cannot open /ais01/dat/work/prod/FAIDALIM_DRIVER1.DAT.

cat: 0652-050 Cannot open /ais01/dat/work/prod/FAIDALIM_DRIVER1.DAT.

***

*** END SEARCH OF LAST JOB (10913846.01) AFTER CONDITIONS RUNHOSTLOG FOR ERROR STRINGS

 

The file does not exist. It was supposed to be created by a BEFORE condition on FAIDALIM.SSH_SFTP_01. RUN HOST COMMAND: {#logrunhost}; touch {#workdat}/FAIDALIM_DRIVER1.DAT; touch {#workdat}/FAIDALIM_DRIVER2.DAT.

I’m not sure how to proceed. There is a source file from SCH05FO@ftp.elmproduction.com:/mailbox/COLOSTAT/INBOX/*.DS2.

I’m not sure where the information from this source file is, or whether it needs to be contained in {#workdat}/FAIDALIM_DRIVER1.DAT.

FAIDALIM.SSH_SFTP_01 status is empty; I’m assuming I can create an empty {#workdat}/FAIDALIM_DRIVER2.DAT. for that one.

Joleen.
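
A minimal sketch of what the BEFORE condition quoted above does, assuming the work directory is passed in as an argument in place of {#workdat}. The function name is hypothetical. Whether an empty driver file is a valid input for the downstream step is exactly the open question in the note above; this only shows the mechanics of creating the files when they are missing.

```shell
#!/bin/sh
# Hypothetical helper: create any missing driver file as an empty file,
# mirroring the "touch" commands in the BEFORE condition. Files that
# already exist are not rewritten, so their contents are kept.
# Usage: ensure_driver_files WORKDIR
ensure_driver_files() {
    workdat=$1
    for name in FAIDALIM_DRIVER1.DAT FAIDALIM_DRIVER2.DAT; do
        [ -e "$workdat/$name" ] || : > "$workdat/$name"
    done
}
```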

 

 

Aborted Module Name:   ODSRHRMS.ODSRS002_01

  Date:        Day:      Time:          Resolution:

04/04/13     Thu        05:44           Restarted by Robin.

 

Error log and follow up comments:

 

 

**** NOT_FINISHED FEEDBACK - FOLLOW-UP REQUIRED - COMPONENTS NOT FINISHED ****

**** NOT_FINISHED FEEDBACK - FOLLOW-UP REQUIRED - COMPONENTS NOT FINISHED ****

Chain Name                    Chain Component Name Status       Date/Time       

----------------------------- -------------------- ------------ -----------------

ODSRHRMS_REFRESH_HRMS_ODSPROD ODSRHRMS.ODSRS002_01 CANCELLED    04/04/13 00:05:44

                              ODSRHRMS.ODSRS002_01 CRITFAIL     04/04/13 00:05:44

**** END OF NOT_FINISHED FEEDBACK ****

 

The error was:

"ORA-08103: object no longer exists - ORA-02063: preceding line from HRPROD@HR_LINK"

The OWB job must have lost its connection.

OK to restart.

Mark. P.

 

 

 

Aborted Module Name: KFSXAPEI.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

04/15/13    Mon       05:30           Restarted by Dermot.

 

Error log and follow up comments:

 

grep: 0652-033 Cannot open /ais02/app/kfs/prd/logs/KFSXAPEI.electronicInvoiceExtractStep.10286358.10286361.00*.

 

2013-04-15 05:30:35,304 [main] INFO  org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: Mon Apr 15 05:30:35 MDT 2013 - Invoking localhost:1099/KFSXAPEI.electronicInvoiceExtractStep.10286358.10286361.00/electronicInvoiceExtractStep

Exception in thread "main" java.lang.NullPointerException

                at org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient.main(BatchJobRmiInvokerClient.java:83)

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.44#> [[ 1 > 0 ]]

<#errtrap_ssh.44#> exit 1

 

2013-04-12 05:33:24,255 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.rice.kns.service.impl.PostProcessorServiceImpl :: finished handling route status change from I to R for document 2357872

2013-04-12 05:33:24,261 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.rice.kew.routeheader.service.impl.WorkflowDocumentServiceImpl :: routeDocument: org.kuali.rice.kew.routeheader.DocumentRouteHeaderValue@3a5c3a5c[

  routeHeaderId=2357872

  documentTypeId=320823

  docVersion=1

  docTitle=Electronic Invoice Reject Document - PO: 373878 Vendor: OfficeMax Inc

 

2013-04-12 05:32:30,170 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Reject document has been created (DocNo=2357872)

2013-04-12 05:32:30,171 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: 606788404omax1_269486APR1113_22752585499849576.xml has been rejected

2013-04-12 05:32:30,174 [RMI TCP Connection(2)-129.82.127.238] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Processing 606788404omax1_271929APR1113_22752587318510904.xml....

2013-04-12 05:32:30,174 [RMI TCP Connection(2)-129.82.127.238] INFO 

 

Cannot find the rejected :: 606788404omax1_269486APR1113_22752585499849576.xml file!

We also had an ABORT on KFSXCS16.CHAIN_SQL_INIT_01 around the same time with ORA-12541: TNS:no listener error.

 

I checked for .processed files and there were none. Josh requested that I restart the ABORTED job; I did & it completed successfully.

Dermot.

 

 

Aborted Module Name: FAIDPMAN_OD.SEND_MAIL_RW_01

  Date:        Day:      Time:          Resolution:

04/22/13    Mon       21:06           Restarted by Joleen.

 

Error log and follow up comments:

 

SMTP recipient() command failed:

5.1.1 <cindy.heckle@colostate.edu>... User unknown

 

error is 255

 

This one was hard to figure out.  When I tried just removing cindy.heckle@colostate.edu from the SEND_MAIL_RW_01 component "Recipient" prompt in Backlog, I got a database query error of "value too large for column".  What finally worked was to delete the entire component prompt value in Backlog, save that change, and then go back in and repopulate the prompt minus  cindy.heckle@colostate.edu , and restart the component, which finished successfully.  Then the same failure occurred with the SEND_MAIL_RW_02 component, and I used the same process to get it to finish successfully.  Not sure what caused this in the first place -- cindy.heckle@colostate.edu appears to be a valid email address from what I can see.  It looks to me like the email recipient list is derived from file /userfiles/Ufaid/data/FAIDPMAN_OD.RWCLIENT-EMAIL_01.DAT

Steve.

 

I'm sorry about this. Cindy Heckle no longer works for us. I thought she was still on campus, but maybe she isn't. I'll get her email address out of all our lists and jobs. 

Candy.
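
A small sketch of taking one address out of a recipient list, assuming the list is a plain-text file with one address per line. The actual layout of FAIDPMAN_OD.RWCLIENT-EMAIL_01.DAT is not shown in this log, so the format and the function name drop_recipient are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print a recipient list with one address removed.
# Assumes one address per line. -F treats the address as a fixed string,
# -x requires a whole-line match, -i ignores case, -v inverts the match.
# Usage: drop_recipient ADDRESS LISTFILE
drop_recipient() {
    grep -v -i -x -F -- "$1" "$2"
}
```

The output goes to standard output, so the original list stays untouched until the result has been reviewed.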

 

 

 

Aborted Module Name:   FAIDSUNT.LYNX_01

  Date:        Day:      Time:          Resolution:

04/26/13    Fri           22:01           Restarted by Joleen.

06/15/13    Sat           04:31          See follow up note below.

Error log and follow up comments:

04/26/13.

   URL=https://wsnet.colostate.edu/cwis231/autorun/JobChain/SummerAwardEmail.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

 

Standard output file:

[SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)]

Line 34:                                OR Hold IS NULL

Line 35:                                )", new SqlConnection(connectionBanner));

Line 36:                                command.Connection.Open();

Line 37:                                SqlDataReader data = command.ExecuteReader();

Line 38:                                StringBuilder result = new StringBuilder();

 

[Win32Exception]: The network path was not found

[SqlException]: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)

It aborted again. It is OK on our side if you want to run it manually. I can delete the LYNX module and let the rest of the job finish running. Do you need to run that manually before I let the job finish?

Joleen.

 

I don’t think we want to delete it, but just comment it out for now and see if it will run tomorrow. Yes I would need to run my page before the job runs.

Zach.

 

That is a great plan. Let me know when you are done running the page and I will comment out the LYNX and let the rest of the job finish.

Joleen.

 

Okay Joleen, I ran the page manually and when I did I found a different error.  I fixed this error and the page executed normally, so I think it will be okay the next time the job auto runs. You can continue running the job now.

Zach.

 

06/15/13.

Error: StartIndex cannot be less than zero.

 

There are no summer award emails to send, so that caused the error.  Can you force the job to finish without running the LYNX? Zach will fix it so it knows what to do if the query is empty the next time.  Sorry to cause you more work!

Candy.

 

I bypassed the LYNX. FAIDSUNT_SUMMER_NOTIFICATIONS has finished running.

Joleen.

 

 

 

 

 

Aborted Module Name:  FAIDSUMR.LYNX_01  

  Date:        Day:      Time:          Resolution:

05/02/13     Thu        22:21           See follow up below.

09/23/13     Mon       21:16           Restarted by Joleen.

04/16/14     Wed       21:16           Deleted by Joleen.

 

Error log and follow up comments:

 

   Oracle.DataAccess.Client.OracleCommand.ExecuteReader(Boolean requery, Boolean fillRequest, CommandBehavior behavior) +4168

   Oracle.DataAccess.Client.OracleCommand.ExecuteReader() +136

   Summer.getBaseCriteria(String pidms, String aidYearCode, String termFall, String termSpring, String termSummer) in e:\USERS\cwis231\wwwroot\autorun\JobChain\Summer.aspx.cs:31

   Summer.Page_Load(Object sender, EventArgs e) in e:\USERS\cwis231\wwwroot\autorun\JobChain\Summer.aspx.cs:20

   System.Web.UI.Control.LoadRecursive() +71

   System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsync

 

This is okay, I will look at it later to fix the issue, but it's basically saying that there are no records to process. We can ignore this and mark the job as complete.

Zach.

 

FAIDSUMR_SUMMER_PROCESSING has finished running.

Joleen.

 

09/23/13.    

[OracleException (0x80004005): ORA-12571: TNS:packet writer failure]

 

Can we try re running these first? This looks like a network hiccup.

Zach.

 

FAIDSUMR.LYNX_01 was restarted and finished successfully -- I then restarted FAIDTRAK_OD.LYNX_02 and it also finished successfully.

Joleen.

 

04/16/14.    

Object reference not set to an instance of an object.

 

Zach asked me to delete FAIDSUMR_SUMMER_PROCESSING from tracking for today. It will be back on the schedule tonight.

Joleen.

 

Look in the chain prompts & copy the standard output file name:

/appworx/out/LYNX_12998547.00.stdout.txt

Change the .txt to .html & send to Support!

Dermot.
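
A short sketch of the rename step above, assuming a copy is preferred over renaming so the original standard output file stays in place. The function name stdout_to_html is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy a LYNX standard output file to the same name
# with an .html extension and print the new path. The original is kept.
# Usage: stdout_to_html /path/to/LYNX_nnn.stdout.txt
stdout_to_html() {
    src=$1
    case $src in
        *.txt) ;;
        *) echo "expected a .txt file: $src" >&2; return 1 ;;
    esac
    dest="${src%.txt}.html"
    cp -- "$src" "$dest" && printf '%s\n' "$dest"
}
```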

 

 

 

 

Aborted Module Name:   AREGTTRN.AREGS620_01

  Date:        Day:      Time:          Resolution:

05/03/13     Fri         08:23           See follow up below.

 

 

Error log and follow up comments:

 

ORA-01843: not a valid month

ORA-06512: at line 130

 

08:23:36 127  --******************************************************************************

08:23:36 128  --Set the request date to the TOD file order date

08:23:36 129  --******************************************************************************

08:23:36 130    vrequest_date := to_date(n_rec.order_date,'YYYYMMDD');

08:23:36 132  --******************************************************************************

 

Just took a quick look at AREGS602, which reads from the SWLTTOD table.  The order_date column does look odd:

42013050

52013050

32013050

92013050

92013050

92013050

 

But I think the file in general is 1 character off.  I think the input must be 1 character off at or before Order Number.  See how Status starts with a 3, which should be the ending digit of Order Date (2013/05/03).

 

Order Number  OrderDate  Status                 Status Update  When to Deliver

------------  ---------  ---------------------  -------------  ---------------

211508        42013050   3Ready For Processing  2013050309192  6now

211508        52013050   3Ready For Processing  2013050309203  0now

211510        32013050   3Ready For Processing  2013050309255  4now

211517        92013050   3Ready For Processing  2013050309482  3now

211518        92013050   3Ready For Processing  2013050309533  0now

211528        92013050   3Ready For Processing  2013050310145  4now

 

Not sure where to go from here.

Vicki.
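
A minimal sketch of a sanity check on order_date values like the ones listed above, assuming they are expected as eight digits in YYYYMMDD form. With the 'YYYYMMDD' mask used on line 130, a value such as 42013050 reads as year 4201, month 30, which is consistent with the ORA-01843 error. The function name valid_yyyymmdd is hypothetical and the check is deliberately loose.

```shell
#!/bin/sh
# Hypothetical pre-check: succeed only when the value is eight digits with
# a month of 01-12 and a day of 01-31. It does not check days per month.
# Usage: valid_yyyymmdd VALUE
valid_yyyymmdd() {
    value=$1
    case $value in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    month=${value#????}      # drop the four year digits, leaving MMDD
    day=${month#??}          # last two digits
    month=${month%??}        # first two of the remaining four
    case $month in
        0[1-9]|1[0-2]) ;;
        *) return 1 ;;
    esac
    case $day in
        0[1-9]|[12][0-9]|3[01]) ;;
        *) return 1 ;;
    esac
}
```

Run against the shifted values, every one fails on the month, while the intended value 20130503 passes.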

 

We’re seeing the same thing.  It looks like TOD has inserted one extra padding character between the Authentication and Order Number fields in the downloaded text files.  I’ll work with Matt on contacting Scrip Safe to get this resolved.

Rob.

 

Per Rob I deleted the failed AREGFQTR process flow and then re-started from the beginning in order to pick up the corrected files from Escrip-safe. It looks like it is working now and AREGS620 is complete.

David.

 

It looks like things are working correctly again.  The data has been corrected and I can see the failed transcript requests in our tables.

Matt – when you get a chance you may want to check TOD’s portal to make sure we don’t have any missing orders.

Rob.

 

 

Aborted Module Name:   KFSXCS53.KFSXS055_01

  Date:        Day:      Time:          Resolution:

05/03/13     Fri         22:23           See follow up below.

 

Error log and follow up comments:

 

 

old   3: utlpath        varchar2(255) := '&utl_path';

new   3: utlpath                     varchar2(255) := '/orautl/kfsprd';

old   4: outfile1      varchar2(80)  := '&utl_file1';

new   4: outfile1                    varchar2(80)  := 'KFSXCS53.KFSXS055_01.utl_file1';

      ,krns_nte_t  t2

 

ERROR at line 26:

ORA-06550: line 26, column 8:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 16, column 1:

ORA-06550: line 39, column 11:

PL/SQL: ORA-00942: table or view does not exist

 

Please make a note that the synonyms have been created for KRNS_NTE_T and KRNS_ATT_T.

This should never impact us, but I want to make sure that we have it noted in the chain.

The only time this would impact us is if we start running batches for Rice, KC or KPME attachments.

Josh.

 

I had to create three - KRNS_ATT_T, KRNS_NTE_T, and KRNS_DOC_HDR_T. In the consolidation we didn't create synonyms for objects that exist in both KFS and KR. Our approach was to deal with issues on a case-by-case basis. In this case I created the synonyms against the KFSUSER tables vs. the KRUSER tables.

Shawn.

 

That is correct, we were planning on these issues coming up.  Creating the synonyms was our plan. I would still like Dermot to document this: if another chain or job is created, the default pointer is going to be the KFS-owned table. Just want to make sure that stays on everyone's mind.

Josh.

 

 

 

 

 

Aborted Module Name:  FAIDLORC_OD.LYNX_03  

  Date:        Day:      Time:          Resolution:

05/06/13     Mon         15:20         See follow up below.

 

 

Error log and follow up comments:

 

Standard output:

<html xmlns="http://www.w3.org/1999/xhtml">

<head id="Head1"><meta http-equiv="content-type" content="text/html; charset=utf-8" /><title>

               Disbursement Amount Discrepency

</title></head>

<body>

<div id="divOK">

 

    Production page that runs in FAIDLORC.

 

Job rprlorc was recently tested and converted to the Banner agent by David and Joleen.

RPRLORC output looks ok to me. The Banner prompt values passed in were picked up properly according to Summary in .lis .

It seems they have a processing issue at their end.

Gudrun.

 

I restarted FAIDLORC_OD.LYNX_03 and it worked this time. Yippee!

Joleen.

 

 

 

 

 

Aborted Module Name:  ODSRAGEN.ODSRS001_01

  Date:        Day:      Time:          Resolution:

05/07/13     Tue         00:04          Restarted by Dermot.

 

 

Error log and follow up comments:

 

 

old   6:     csug_ods_refresh.log_begin_time('&REFRESH_APP');

new   6:     csug_ods_refresh.log_begin_time('REFRESH_STUDENT');

old   8:     ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

new   8:     ia_admin.mgkmap.P_RunETLMapSlots('3', job, 'REFRESH_STUDENT', NULL, '');

old  14:     csug_ods_refresh.log_end_time('&REFRESH_APP');

new  14:     csug_ods_refresh.log_end_time('REFRESH_STUDENT');

begin

*

ORA-20001: ODS Refresh Failed

ORA-06512: at line 12

 

00:04:46   5          begin

00:04:46   6            csug_ods_refresh.log_begin_time('&REFRESH_APP');

00:04:46   7            select sys.jobseq.nextval into job from dual;

00:04:46   8            ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

00:04:46   9            select mdblogh_error_ind into error_ind

00:04:46  10           from ia_admin.mdblogh where mdblogh_jobno=job;

00:04:46  11           if (error_ind <> 'N') then

00:04:46  12             raise_application_error(-20001,'ODS Refresh Failed');

00:04:46  13           end if;

00:04:46  14           csug_ods_refresh.log_end_time('&REFRESH_APP');

00:04:46  15         end;

 

The error was too generic for me to understand what the issue might be so I tried phoning Mark P for more direction. He was unavailable so I phoned Mark B. Mark advised to check OWB.

I logged into the ODS IA Admin tool. I identified an issue in the UPDATE_MST_ADMISSIONS_REQUIRE mapping. A "value too large" error occurred in the MST_Admissions_Requirement TABLE on the Requirement_Comment column. All the other STUDENT mappings appeared to have run successfully.

I phoned Dermot and told him this was a data issue and would have to wait until morning. Immediately afterward, Mark B. phoned back. He had checked the Ellucian support center and there was a workaround for this issue. He sent me an email with SQL to change the size of the Requirement_Comment column to 4000 characters. I did this, phoned Dermot back, and asked him to rerun the job, which he is doing.

Thank you Mark Britton for the extra effort above and beyond the call of duty and thank you Dermot for your patience in the wee hours of the morning.

Shawn.

 

Here is the error:

ORA-12899: value too large for column "ODSMGR"."MST_ADMISSIONS_REQUIREMENT"."REQUIREMENT_COMMENT" (actual: 38, maximum: 30)

I will need to fix the mapping and the underlying table before we can re-start this.

On call news states that Shawn and Mark B. figured this out last night.

Mark. P.

 

 

 

Aborted Module Name:   AREGDYIR.AREGS800_01

  Date:        Day:      Time:          Resolution:

05/07/13     Tue         00:04          Restarted by Joleen.

 

Error log and follow up comments:

 

00:04:56 681  /

old 459:   lv_log_path := '&utl_path';

new 459:   lv_log_path := '/orautl/BANPROD';

old 460:   lv_log_file := '&utl_file1';

new 460:   lv_log_file := 'AREGDYIR.AREGS800_01.utl_file1';

DECLARE

ERROR at line 1:

ORA-01400: cannot insert NULL into

("SATURN"."SIBINST"."SIBINST_OVERRIDE_PROCESS_IND")

ORA-06512: at line 147

ORA-06512: at line 272

ORA-06512: at line 516

00:04:56 268          IF cur_existing_sibinst%NOTFOUND THEN

00:04:56 269            /*******************************/

00:04:56 270            /* New SIBINST record - INSERT */

00:04:56 271            /*******************************/

00:04:56 272            pInsert_SIBINST;

00:04:56 273            lv_inst_inserts := lv_inst_inserts + sql%rowcount;

00:04:56 274          ELSE   -- Cur_Existing_SIBINST%FOUND

 

Just a little bit on what I found…

I checked the 8.5.1 student release guide and there are three new columns in the SIBINST table (the sibinst_override_process_ind is required).  It looks like AREGS800 does an insert into SIBINST, but doesn’t have this new required column in the insert.  All existing rows appear to be defaulted to ‘N’.

Is anyone familiar with this process?

Rob.

 

It appears that 3 new columns have been added to SIBINST - Faculty Member Base Table.

AREGS800 inserts into SIBINST and the new column SIBINST_OVERRIDE_PROCESS_IND cannot be null.

It appears that the only value in SIBINST_OVERRIDE_PROCESS_IND currently is an ‘N’.

We will need to modify AREGS800 to insert an ‘N’ into SIBINST_OVERRIDE_PROCESS_IND.

 

SIBINST_OVERRIDE_PROCESS_IND          NOT NULL           VARCHAR2(1 CHAR) 

SIBINST_OVERRIDE_PROC_USERID                                         VARCHAR2(30 CHAR)

SIBINST_OVERRIDE_PROC_DATE                                              DATE    

Vicki.

 

 

 

 

 

Aborted Module Name:  AROSDGLI.AROSS165_01

  Date:        Day:      Time:          Resolution:

05/10/13     Fri         22:08           Restarted by Joleen.

 

Error log and follow up comments:

 

 

ORA-12899: value too large for column

ORA-06512: at line 29

 

ORA-12899: value too large for column

"CSUBAN"."GURAPAY_BACKUP"."GURAPAY_STREET_LINE1" (actual: 39, maximum: 30)

ORA-06512: at line 29

 

22:08:34  28   begin

22:08:34  29    INSERT INTO gurapay_backup

22:08:34  30    (gurapay_system_id,

22:08:34  31    gurapay_system_time_stamp,

22:08:34  32    gurapay_doc_code,

22:08:34  33    gurapay_user_id,

22:08:34  34    gurapay_pidm,

22:08:34  35    gurapay_id,

22:08:34  36    gurapay_tran_number,

22:08:34  37    gurapay_detail_code,

22:08:34  38    gurapay_desc,

22:08:34  39    gurapay_term_code,

22:08:34  40    gurapay_account,

22:08:34  41    gurapay_dr_cr_ind,

22:08:34  42    gurapay_srce_code,

22:08:34  43    gurapay_last_name,

 

22:08:34 104    ,GURAPAY_ACTIVITY_DATE

22:08:34 105    ,SYSDATE

22:08:34 106    ,v_process_date

22:08:34 107     FROM gurapay);

22:08:34 108   vin_count := sql%rowcount;

22:08:34 109  end;

 

The problem record is for gurapay_pidm 11360346 and the gurapay_street_line1 value is "Flat 604 Block 1 No 165 Hepingli Estate", which is 39 characters.  Our gurapay_backup table will need to be redefined to accept larger values, but for now can someone truncate the value to 30 characters to get our schedule restarted?

Steven Dove.

 

Kathy has truncated the value. I restarted AROSDGLI.AROSS165_01 and it has finished running.

Joleen.
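
A small sketch of what truncating the problem value to the 30-character maximum yields. The log does not say how the value was truncated in the database, so this only illustrates the length arithmetic; the function name truncate_street is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first 30 characters of a value, matching
# the 30-character maximum reported in the ORA-12899 message above.
# Usage: truncate_street VALUE
truncate_street() {
    printf '%s\n' "$1" | cut -c1-30
}

street='Flat 604 Block 1 No 165 Hepingli Estate'
echo "original length: ${#street}"
echo "truncated: $(truncate_street "$street")"
```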

 

 

 

Aborted Module Name:   KFSXAPEI.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

05/17/13     Fri          05:33           Restarted by Dermot.

05/12/14     Mon       05:30           Restarted by Dermot.

 

Error log and follow up comments:

 

05/17/13.

                Started processing step electronicInvoiceExtractStep of job KFSXAPEI.electronicInvoiceExtractStep.10537434.10537437.00 for user kr

                Executing step: electronicInvoiceExtractStep

                #### Log file name for this job step : /ais02/app/kfs/prd/logs/KFSXAPEI.electronicInvoiceExtractStep.10537434.10537437.00-20130517-05-30-04-337.log

                *******************************************************

 

2013-05-17 05:33:03,467 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.kuali.rice.kns.exception.ValidationException: business rule evaluation failed

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.35#> [[ 1 > 0 ]]

<#errtrap_ssh.38#> exit 1

 

cd /ais02/app/kfs/prd/logs

ls -ltr KFSXAPEI*

2013-05-17 05:33:03,016 [RMI TCP Connection(2)-129.82.111.82] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Saving Invoice Reject for DUNS '150982189'

2013-05-17 05:33:03,019 [RMI TCP Connection(2)-129.82.111.82] INFO  org.kuali.rice.kns.document.DocumentBase :: invoking rules engine on document 2412223

 

cd /ais02/app/kfs/prd/work/staging/purap/electronicInvoice

 

2013-05-17 07:50:49,494 [RMI TCP Connection(16)-129.82.111.82] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: Reject document has been created (DocNo=2412331)

2013-05-17 07:50:49,498 [RMI TCP Connection(16)-129.82.111.82] INFO  org.kuali.kfs.module.purap.service.impl.ElectronicInvoiceHelperServiceImpl :: 150982189_8053988027_25775814900711560.xml has been rejected

 

I found a white space within “150982189_8053988027_25775814900711560.xml”, which I edited. I then removed the “processed” xml files & restarted the failed job!

Dermot.

 

05/12/14.    

 

grep: 0652-033 Cannot open /ais02/app/kfs/prd/logs/KFSXAPEI.electronicInvoiceExtractStep.13211330.13211335.00*.

 

There was no output file & the Java step ran for just 5 seconds. I checked for processed files in directory /ais02/app/kfs/prd/work/staging/purap/electronicInvoice with ls -ltr *processed. There were none, so obviously the job never got going. I restarted the step & it completed successfully.

Dermot.
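
A minimal sketch of the processed-file check described above, assuming that files whose names end in "processed" in the staging directory are the sign that the Java step started work. The function name count_processed is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print how many files ending in "processed" exist in
# a staging directory. Zero suggests the step never got going.
# Usage: count_processed DIR
count_processed() {
    dir=$1
    count=0
    for f in "$dir"/*processed; do
        # An unmatched glob is left as a literal name, so confirm it exists.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}
```

A count of 0 matches the 05/12/14 case, where a plain restart was enough; a non-zero count matches the 05/17/13 case, where the processed files were removed first.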

 

 

 

 

 

 

Aborted Module Name:   AROSDBIO.AROSS141_01

  Date:        Day:      Time:          Resolution:

05/21/13     Tue        18:05           Restarted by Joleen.

 

 

Error log and follow up comments:

 

 

18:05:35 147  END; --main block.

18:05:35 148  /

old   4: lv_directory              VARCHAR2(30)  := '&&utl_path';

new   4: lv_directory              VARCHAR2(30)  := '/orautl/BANPROD';

old   5: lv_logfile                VARCHAR2(30)  := '&&utl_file1';

new   5: lv_logfile                VARCHAR2(30)  := 'AROSDBIO.AROSS141_01.utl_file1';

**** Start of AROSS141 **** 05/21/2013 18:05:35

DECLARE

*

ERROR at line 1:

ORA-20100: AROSS141 Failure: -20100 ORA-20100: Employee and Associate address

updates must be made in the HR system.

ORA-06512: at line 104

 

Our AROSDBIO flow aborted last night.  Can you tell me what the error is?

 

 

Steven Dove.

 

A little bit more info from /orautl/BANPROD/AROSDBIO.AROSS141_01.utl_file1

Insert failed: 829912301 Moore -20100 ORA-20100: Employee and Associate address updates must be made in the HR system.

Steve G.

 

I’ve found the problem data, but I don’t see a way to delete it through any form.  Can someone delete it from the database and then we can restart our schedule?  Janet is going to check with the end user to see how they were able to create an address update record for an employee.

-- This should be 1 row

delete from twrcust where twrcust_id = '823493035';

Steven Dove.

 

Mark P has removed the CSUS_TERM_INFO_CUR, SPR, SMR, FAL from ODSRS002, so we are ready to restart this chain.

Vicki.

 

 

 

 

 

 

 

 

 

Aborted Module Name:   ODSRAGEN.ODSRS001_01

  Date:        Day:      Time:          Resolution:

05/22/13     Wed        03:24           Restarted by Robin.

 

Error log and follow up comments:

 

00:04:49  22  --*  THE FOLLOWING 2 LINES ARE REQUIRED

00:04:49  23  --*    .       -- THIS ENDS THE INPUT MODE FOR A PL/SQL BLOCK IN SQLPLUS

00:04:49  24  --*    /       -- THIS EXECUTES THE PL/SQL BLOCK STORED IN THE BUFFER

00:04:49  26  .

00:04:49 SQL> /

old   6:     csug_ods_refresh.log_begin_time('&REFRESH_APP');

new   6:     csug_ods_refresh.log_begin_time('REFRESH_STUDENT');

old   8:     ia_admin.mgkmap.P_RunETLMapSlots('&USERNO', job, '&REFRESH_APP', NULL, '');

new   8:     ia_admin.mgkmap.P_RunETLMapSlots('3', job, 'REFRESH_STUDENT', NULL, '');

old  14:     csug_ods_refresh.log_end_time('&REFRESH_APP');

new  14:     csug_ods_refresh.log_end_time('REFRESH_STUDENT');

 

The ODS job appears to have failed due to a data error.

The following mapping experienced a unique constraint error on MST_GENSTU_END_TERM_INDEX_01. I will advise Scheduling that it is OK to let the error go until morning.

 

UPDATE_MST_GENRL_STDNT_STEP_1

Shawn.

 

The index on MST_GENSTU_END_TERM is PERSON_UID, ACADEMIC_PERIOD_START (I believe)

If I look at MST_GENSTU_END_TERM and AS_GENSTU_END_TERM for more than 1 occurrence of PERSON_UID, ACADEMIC_PERIOD_START, I do not see it.

select PERSON_UID, ACADEMIC_PERIOD_START from as_genstu_end_term group by PERSON_UID, ACADEMIC_PERIOD_START having count(*) > 1;

This is getting very serious - no CSUS_TERM_INFO data or updated data and no updates to CSUS_SECTION_INFO data!

We are going to have to figure out what to say to campus and get something out on the ODS List Serv this morning.

Vicki.
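For reference, the duplicate check that the GROUP BY ... HAVING COUNT(*) > 1 query performs can be sketched outside the database. This is only an illustration: the sample rows are made up, and find_duplicate_keys is a hypothetical helper, not part of the ODS.

```python
from collections import Counter


def find_duplicate_keys(rows):
    """Return keys that occur more than once, like GROUP BY ... HAVING COUNT(*) > 1."""
    counts = Counter(rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up sample data: (PERSON_UID, ACADEMIC_PERIOD_START) pairs.
sample = [
    (1001, "201310"),
    (1001, "201360"),
    (1002, "201310"),
    (1002, "201310"),  # same key twice: this is what the unique index rejects
]

print(find_duplicate_keys(sample))
```

An empty result from a check like this on the stored rows, which is what Vicki saw, may mean the duplicate is produced while the mapping runs rather than sitting in the table, but the log does not confirm that.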

 

 

 

 

 

 

Aborted Module Name:   AREGFQTR.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

05/25/13     Sat          12:21           Restarted by David.

 

Error log and follow up comments:

 

 

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109"  colora-88@iwantmytranscript.com

# > sftp> pwd

# > Remote working directory: /home/colora-88

# > sftp> lpwd

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l /ais01/bkp/AREGTTRN.AREGS621_01.10601658.XML

# > -rw-rw----    1 appworx  Gprd            291 May 25 12:21 /ais01/bkp/AREGTTRN.AREGS621_01.10601658.XML

# > sftp> -ls -l /home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml" not found

# > sftp> put /ais01/bkp/AREGTTRN.AREGS621_01.10601658.XML /home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml

# > Uploading /ais01/bkp/AREGTTRN.AREGS621_01.10601658.XML to /home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml

# > Couldn't write to remote file "/home/colora-88/statuses/awaiting-process/colorado_state.20130525_122151.status.xml": Failure

# > (1)

 

This chain sends Electronic Transcripts to TOD.  I am not familiar with it and had hoped to see Phil or Rob reply.  I am also including Matt in this conversation because he may have a suggestion as to what to do.

Vicki.

 

I couldn't tell from that output if it means we could not connect to their site, or if there is an internal issue here.  I have sent an email to scrip-safe to see if they could check on things on their side of this and see if they find anything not working/setup correctly.

I see that their ordering site is back up.  I also just got an email from them asking us to try again, as all their other schools are connecting fine as far as connecting to pull orders and return status files.

Can we retry the job/chain/process?

Matt.

 

AREGFQTR.SSH_SFTP_01 has been reset and completed successfully. Transcripts are running again.

David.

 

Looks like about 94 transcripts got processed and had emails sent out from eScrip-Safe at about 4:30 PM our time...

Matt.

 

 

 

Aborted Module Name: FAIDDLDR_EV.RPRDU14_01  

  Date:        Day:      Time:          Resolution:

05/30/13     Thu        12:52           Deleted by David.

 

Error log and follow up comments:

 

%Error% - Invalid or previously processed file (/ais02/dat/finaid/mpninaop.dat)

Processing MPN Due to Expire Report Acknowledgements...

%Error% - Invalid or previously processed file (/ais02/dat/finaid/mpnexpop.dat)

 

I noticed that the mpndisop.dat and mpninaop.dat message classes are in both the OD and EV FAIDSAIG driver files. Could this be the problem with FAIDDLDR_EV.RPRDU14_01?

David.

 

We normally have the message classes in both schedules of FAIDSAIG since they’re not aid year specific. The messages listed below are standard errors that come out of the RPRDUXX process. Here’s similar output from the FAIDDLDR_OD from 05/25/13:

Since the same error messages appear on almost every run of FAIDDLDR_OD, could there be another reason the EV schedule aborted? (It was the first run of the year and this is a job we “test” in production.)

Karma.

 

I see that RERIM-LOOP_01 ran about 70 iterations.

I’m not seeing the reason for the RPRDU14 failure. I’ll keep looking. You are saying that the errors reported are typical, correct?

David.

 

Based on what I see, the errors listed for the mpndisop.dat, mpninaop.dat and mpnexpop.dat files are consistent with what we see in the OD schedule that’s running with RPRDU13.

There is a new exit counseling file, AHSLDEOP, that’s being brought in with RPRDU14. Is it possible there’s a problem with that file? I looked at the FAIDSAIG_EV output and it didn’t look like we picked any files up but I thought I’d throw it out there.  

Karma.

 

We discovered an appman output scan that was looking for the text ‘error’ in the output log file. This is why RPRDU14 aborted. Since RPRDU14 actually did run successfully, I deleted the component so the process flow could continue. I have removed this output scan from RPRDU14. If we want to scan for any specific errors in RPRDU14, we would need to create a new output scan specific to it. I noticed that neither RPRDU13 nor RPRDU12 had output scans, so I’m not sure why one was added to RPRDU14.

David.
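The difference between a broad output scan and one aimed at specific errors can be sketched as below. This is an illustration only: it assumes the appman scan ignored case (the log text reads "%Error%"), and the narrower ORA-nnnnn rule is a hypothetical example, not a scan that exists on RPRDU14.

```python
import re

# Lines taken from the RPRDU14 output quoted in this entry.
LOG_LINES = [
    "%Error% - Invalid or previously processed file (/ais02/dat/finaid/mpninaop.dat)",
    "Processing MPN Due to Expire Report Acknowledgements...",
    "%Error% - Invalid or previously processed file (/ais02/dat/finaid/mpnexpop.dat)",
]


def broad_scan(lines):
    """Flag any line containing 'error', ignoring case (assumed behaviour of the old scan)."""
    return [line for line in lines if "error" in line.lower()]


# Hypothetical narrower rule: only Oracle ORA-nnnnn codes count as fatal.
FATAL = re.compile(r"\bORA-\d{5}\b")


def targeted_scan(lines):
    """Flag only lines that carry an ORA-nnnnn error code."""
    return [line for line in lines if FATAL.search(line)]


print(len(broad_scan(LOG_LINES)), len(targeted_scan(LOG_LINES)))
```

With the broad rule the routine "previously processed file" messages are enough to abort the job; with a targeted rule they pass through.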

 

 

 

Aborted Module Name:   HRMSENCD.HRMSS079_01

  Date:        Day:      Time:          Resolution:

06/05/13     Wed       18:01           Restarted by Dermot.

10/18/13     Fri          18:01           Restarted by David.

 

Error log and follow up comments:

 

 

/ais01/dat/work/prod/HRMSENCD.HRMSS079_01.10680315.10680318.00.2013_06_05_1801_sql_followup

+ cat /ais01/dat/work/prod/HRMSENCD.HRMSS079_01.10680315.10680318.00.2013_06_05_1801_sql_followup

***

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

06/05/2013 19:03    DBARRETT

Received page for critical job failure in HRMSENCD chain.

It's the error we sometimes see in HRMSS079 with resource busy. 

I tried to resubmit HRMSS079, but it failed again with same message.

I called Mark B. (on-call DBA) to check the HR database.

Mark called back & had me restart the failed job & it completed successfully.

 

Thanks for the information.  The next time this happens we will get the session information and dig a bit deeper to try and identify the underlying problem.

Steve H.

 

10/18/13.

Dawn received a page for a critical job failure in the HRMSENCD chain and notified me.

It's the error we sometimes see in HRMSS079 with resource busy.  I tried to resubmit HRMSS079, but it failed again with the same message. I called Mark B. (on-call DBA) to check the HR database.

Mark had me re-start several times and we finally got lucky. It completed successfully.
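The repeated manual restarts amount to a retry loop. A minimal sketch of that idea with a growing wait between attempts is below. ResourceBusy, run_with_retries and flaky_job are hypothetical names for illustration; nothing here is an AppMan feature, and retrying does not identify the session holding the lock.

```python
import time


class ResourceBusy(Exception):
    """Stand-in for a job failure caused by ORA-00054."""


def run_with_retries(job, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call job() until it succeeds, waiting base_delay * 2**n between tries."""
    for n in range(attempts):
        try:
            return job()
        except ResourceBusy:
            if n == attempts - 1:
                raise  # out of attempts: let the failure surface
            sleep(base_delay * 2 ** n)


# Demo with a fake job that fails twice before succeeding.
calls = {"count": 0}


def flaky_job():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ResourceBusy("ORA-00054: resource busy")
    return "completed"


delays = []  # record the waits instead of actually sleeping
result = run_with_retries(flaky_job, sleep=delays.append)
print(result, delays)
```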

 

 

 

 

Aborted Module Name:   KFSXAPAP.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

06/11/13     Tue       19:36             See notes below.

 

Error log and follow up comments:

 

2013-06-11 19:35:54,717 [main] INFO  org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient ::

                *******************************************************

                Started processing step autoApprovePaymentRequestsStep of job KFSXAPAP.autoApprovePaymentRequestsStep.10724485.10724487.00 for user kr

                Executing step: autoApprovePaymentRequestsStep

                #### Log file name for this job step : /ais02/app/kfs/prd/logs/KFSXAPAP.autoApprovePaymentRequestsStep.10724485.10724487.00-20130611-19-35-47-749.log

                *******************************************************

 

2013-06-11 19:35:54,719 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.kuali.rice.kns.exception.ValidationException: business rule evaluation failed

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

 

2013-06-11 19:35:53,159 [RMI TCP Connection(32)-129.82.111.82] INFO  org.kuali.kfs.module.purap.document.service.impl.PaymentRequestServiceImpl ::  -- Initial filtering complete, returned 24 docs.

2013-06-11 19:35:53,862 [RMI TCP Connection(32)-129.82.111.82] INFO  org.kuali.rice.kns.document.DocumentBase :: invoking rules engine on document 2299826

2013-06-11 19:35:53,867 [RMI TCP Connection(32)-129.82.111.82] INFO  org.kuali.kfs.module.purap.document.PurchasingAccountsPayableDocumentBase :: Checking persisted source accounting lines for read-only fields

2013-06-11 19:35:53,878 [RMI TCP Connection(32)-129.82.111.82] INFO  org.kuali.kfs.module.purap.document.PurchasingAccountsPayableDocumentBase :: Checking source accounting lines for read-only fields

2013-06-11 19:35:54,115 [RMI TCP Connection(32)-129.82.111.82] ERROR org.kuali.kfs.module.purap.document.validation.impl.PaymentRequestReviewValidation :: validatePaymentRequestReview() Payment Request 254230, Item 1 has quantity '1.00' but outstanding encumbered quantity 0.00

 

KFSXAPAP.KFSX_JAVA_01 / KFSXAPAP_PURAP_APPROVE_PYMTS aborted last night; the job cancelled itself and did not hold up the schedule. Document 2299826 was reported to Swaro, and it ran successfully the next evening.

Dermot.

 

 

 

 

Aborted Module Name:  KFSXFPPD.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

06/13/13     Thu       14:23             Removed illegal characters, Shawn updated table, job restarted.

 

 

Error log and follow up comments:

 

2013-06-13 14:23:18,570 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springframework.jdbc.UncategorizedSQLException: OJB operation; uncategorized SQLException for SQL []; SQL state [72000]; error code [12899]; ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 92, maximum: 90)

; nested exception is java.sql.SQLException: ORA-12899: value too large for column "KFSUSER"."PDP_PMT_NTE_TXT_T"."CUST_NTE_TXT" (actual: 92, maximum: 90)

 

This is the output from the sql I ran (see sql script below).

1784540 char = 49840 ° Position:83

1980907 char = 49793 Position:71

1980907 char = 49793 Position:79

1980907 char = 49793 Position:104

1980907 char = 49793 Position:108

2149640 char = 50102 ö Position:4

2271266 char = 49810 ? Position:39

2321383 char = 50076 Ü Position:4

2427751 char = 50051 Ã Position:70

2427751 char = 49833 © Position:71

2429665 char = 50051 Ã Position:70

2429665 char = 49833 © Position:71

2449892 char = 50051 Ã Position:78

2449892 char = 49833 © Position:79

 

declare

  -- Report every character in the DV check stub text whose ASCII() value is above 255.

  char_code number;

  vchar     varchar2(20);

  loop_size number;

  cursor s1 is

    select fdoc_nbr,

           dv_chk_stub_txt,

           length(dv_chk_stub_txt) sz

    from fp_dv_doc_t

    where dv_chk_stub_txt is not null;

begin

  for x in s1 loop

    -- Positions past the end of the text return NULL and fail the test below.

    loop_size := x.sz + 5;

    for i in 1..loop_size loop

      vchar := substr(x.dv_chk_stub_txt, i, 1);

      char_code := ascii(vchar);

      if char_code > 255 then

        dbms_output.put_line(x.fdoc_nbr || ' char = ' || char_code || ' ' || vchar || ' Position:' || i);

      end if;

    end loop;

  end loop;

end;

Dermot.
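The char values in the output above are consistent with each character's UTF-8 bytes combined into one number, which is also why a note can be longer in bytes than in characters. The sketch below reproduces the listed values. It assumes the column limit of 90 is counted in bytes, and the sample note text is made up.

```python
def oracle_style_code(ch):
    """Combine the UTF-8 bytes of one character into a single integer."""
    return int.from_bytes(ch.encode("utf-8"), "big")


# Characters that appear in the script output above.
for ch in ["°", "ö", "Ü", "Ã", "©"]:
    print(repr(ch), oracle_style_code(ch), len(ch.encode("utf-8")), "bytes")

# Made-up note text: 90 characters, two of which need two bytes each in UTF-8.
note = "é" * 2 + "x" * 88
print(len(note), len(note.encode("utf-8")))
```

Under that assumption a 90-character note containing two such characters needs 92 bytes, which matches the "actual: 92, maximum: 90" in the ORA-12899 message.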

 

 

 

 

Aborted Module Name:   FAIDLORC_EV.LYNX_01

  Date:        Day:      Time:          Resolution:

06/15/13     Sat          00:29           See note below.

09/23/13     Mon       15:22           Restarted by Joleen.

 

Error log and follow up comments:

 

 

Error:

ORA-12571: TNS:packet writer failure

 

I don’t know what’s wrong with this one, but I ran that page manually and it didn’t error.  Can you start up again from where the job left off?  You wouldn’t need to run the LYNX (since I just ran that page), but you could start at the next step.

Sorry, I don’t know more about that one :(

Candy.

 

I bypassed the aborted LYNX_01 and ran the rest of the job. FAIDLORC_DIRECT_LOAN_REC has finished running.

Joleen.

 

09/23/13.

wc: 0653-755 Cannot open /appworx/out/FAIDLORC_EV.LYNX_03.11510533.00.status.txt.

       0 /appworx/out/LYNX_11510533.00.stdout.txt

      15 /appworx/out/FAIDLORC_EV.LYNX_03.11510533.00.stderr.txt

      15 total

*** /appworx/out/FAIDLORC_EV.LYNX_03.11510533.00.stderr.txt ***

Looking up wsnet.colostate.edu

Making HTTP connection to wsnet.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Retrying as HTTP0 request.

Looking up wsnet.colostate.edu

Making HTTP connection to wsnet.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Alert!: Unexpected network read error; connection aborted.

Can't Access `http://wsnet.colostate.edu/cwis231/autorun/disb_discrepency.aspx?ay=1314'

Alert!: Unable to access document.

lynx: Can't access startfile

***

[101] : *** ERROR Detected in Output : File Empty ***

 

We apparently had a server app pool issue, but I think you could give it a try again.

Candy.

 

 

 

Aborted Module Name:   AROSDBIO.AROSS141_01

  Date:        Day:      Time:          Resolution:

06/19/13     Wed       18:05           Restarted by Joleen.

 

Error log and follow up comments:

 

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-20100: AROSS141 Failure: -20100 ORA-20100: Employee and Associate address

ORA-06512: at line 104

Robin.

 

Here is more information on the abort:

 

Insert failed: 822253958 Garber -20100 ORA-20100: Employee and Associate address updates must be made in the HR system.

Joleen.

 

I removed the problem record.  Can you restart our production schedule?

Steven Dove.

 

 

 

 

Aborted Module Name:   AREGRTWL_SM.SFRBWLP_01

 

  Date:        Day:      Time:          Resolution:

06/24/13     Mon        00:05           Deleted by Joleen, see note below.

 

Error log and follow up comments:

 

Here is the .shl file:

$JOB -f -o $H/$TEMP.lis 0<$H/$TEMP.in 1>$LOG 2>&1

progRet=$?

/bin/rm $H/$TEMP.in 1>>$LOG 2>&1

/bin/rm $H/$TEMP.shl 1>>$LOG 2>&1

exit $progRet

Joleen.
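The .shl file saves the job's exit status in progRet before the cleanup commands run, so the rm calls cannot overwrite it. A minimal sketch of the same pattern follows; run_then_cleanup is a hypothetical helper, and the child command is a stand-in for $JOB.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_then_cleanup(cmd, temp_files):
    """Run cmd, remember its exit status, remove temp files, return the saved status."""
    prog_ret = subprocess.run(cmd).returncode  # like progRet=$? right after $JOB
    for path in temp_files:
        Path(path).unlink(missing_ok=True)     # like /bin/rm $H/$TEMP.in
    return prog_ret                            # like exit $progRet


# Demo: a child process that exits with status 3, plus one scratch file to clean up.
scratch = tempfile.NamedTemporaryFile(delete=False)
scratch.close()
status = run_then_cleanup(
    [sys.executable, "-c", "import sys; sys.exit(3)"], [scratch.name]
)
print(status, Path(scratch.name).exists())
```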

 

sleepwake process flow AREGRTWL_* is being run every 30 minutes in AppMan.

For SM: 05 and 35  every hour

For FA: 09 and 39  every hour 

Once sleepwake is running again and the table entry exists, AppMan is fine.

At the moment, aborts of the Banner job sfrbwlp continue to accumulate in the backlog.

The usual email generated by the AppMan AREGRTWL_* sleepwake process restart was NOT sent out last night.

Its recipients are REGSCHED_LIST@colostate.edu and IS DL: Alert AGEN AREG.

Example from last Sunday:

For Fall:

----------------------------------------------------------------

***  Sleep Wake process SFRBWLP was restarted for FA_SFRBWLP

***  Next Execution: 2013/06/17 00:15:23

***  System Time:    2013/06/17  00:10:56

----------------------------------------------------------------

For Summer:

----------------------------------------------------------------

***  Sleep Wake process SFRBWLP was restarted for SM_SFRBWLP

***  Next Execution: 2013/06/17 00:11:01

***  System Time:    2013/06/17  00:06:52

----------------------------------------------------------------

Gudrun.

 

I passed this along to Vicki and Phil...I think this might be a set up issue in Banner.  I haven't heard back from them yet. 

Mark B.

 

All AWPROD sleepwake AREGRTWL process flows with aborted sfrbwlp jobs can be deleted.

Just keep one in backlog so that a test run can be performed once DBAs have a fix.

I logged the issue with them last night, but yesterday was a long day and I am not sure how far they got.

Gudrun.

 

 

 

Aborted Module Name:  KFSXAPPC.KFSXS074_01  

  Date:        Day:      Time:          Resolution:

06/24/13     Mon        19:21           Deleted by Dermot.

 

Error log and follow up comments:

 

old   6: utlpath  varchar2(255) := '&&utl_path';

new   6: utlpath  varchar2(255) := '/orautl/kfsprd';

old   7: infile1  varchar2(80)         := '&&utl_file1';

new   7: infile1  varchar2(80)       := 'KFSXAPPC.KFSXS074_01.utl_file1';

old   8: outfile1 varchar2(80)       := '&&utl_file2';

new   8: outfile1 varchar2(80)     := 'KFSXAPPC.KFSXS074_01.utl_file2';

**** Start of KFSXS074 06/24/2013 19:21:15

237239 Already Closed.

ORA-06502: PL/SQL: numeric or value error: character to number conversion error

**** STATISTICS *****

Number of Records Read               = 225

Number of Records written = 223

Number of Errors               = 1

Successfully Complete.

 

 

This is a data issue that I will need to look into.

 

Go ahead and cancel the aborted job.

It looks like all but one PO will get closed.

We can manually close it in the morning.

Josh.

 

 

 

 

Aborted Module Name: FAIDRGRT_EV.RWRDCLN_01 

  Date:        Day:      Time:          Resolution:

06/24/13     Mon        21:34           Restarted by Gudrun.

 

Error log and follow up comments:

 

 

Output directory /appworx/out

User search pattern:.*[^0-9']{ONE_UP}[^0-9].*

User search evaluated: .*[^0-9']3119750[^0-9].*

Search /appworx/out using .*[^0-9']3119750[^0-9].*

Files [/appworx/out/rwrdcln_3119750.log]

User rename pattern:FAIDRGRT_EV.RWRDCLN_01.BANNER.10819190.10819196.00.{ONE_UP}.{fileext}

User rename evaluated: FAIDRGRT_EV.RWRDCLN_01.BANNER.10819190.10819196.00.{ONE_UP}.{fileext}

Renaming file from:/appworx/out/rwrdcln_3119750.log

to:/appworx/out/FAIDRGRT_EV.RWRDCLN_01.BANNER.10819190.10819196.00.3119750.log

Cap: Error executing Banner commandError code returned:139

Command:/appworx/banner/banprod/UC4gjajobs.shl rwrdcln C jobprd pass 3119750 NOPRINT 

******************************************* End of job log *******************************************

Job Aborted: : Error executing Banner commandError code returned:139
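The search lines in the log show the {ONE_UP} placeholder being replaced by the Banner one-up number before the output directory is searched. A minimal sketch of that substitution is below. It assumes the pattern has to match the whole file name; build_search is a hypothetical helper, and the two non-matching file names are made up.

```python
import re


def build_search(pattern, one_up):
    """Substitute the {ONE_UP} placeholder, as the 'User search evaluated' line shows."""
    return pattern.replace("{ONE_UP}", str(one_up))


USER_PATTERN = ".*[^0-9']{ONE_UP}[^0-9].*"
evaluated = build_search(USER_PATTERN, 3119750)
print(evaluated)

# Assumption: the evaluated pattern must match the entire file name.
for name in ["rwrdcln_3119750.log", "rwrdcln_13119750.log", "rwrdcln_31197500.log"]:
    print(name, bool(re.fullmatch(evaluated, name)))
```

The non-digit guards on both sides of the number keep a longer number that merely contains 3119750 from being picked up.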

 

All is not quite well yet in AppMan Banner world. There are two jobs aborted in AWPROD that failed because they encountered a memory fault error during execution.

Only an empty Banner .log file is created. .lis file is missing.

 

Not all Banner C jobs appear to throw this error so something is different about these jobs.

 

SWPCOFA is run as part of FAID process flow FAIDCFA2_COF_ATTRIBUTES_2

 AGENC001 is run as part of AGEN process flow AGENDYHB_HRMS_NEW_PERSON_BRDG

Could these jobs possibly be recompiled?  I will restart and see if the issue remains.

+ 1>> /appworx/out/swpcofa_3119560.in

 

+ echo $JOB -f -o $H/$TEMP.lis 0<$H/$TEMP.in 1>$LOG 2>&1

+ 1>> /appworx/out/swpcofa_3119560.shl

+ echo progRet=$?

+ 1>> /appworx/out/swpcofa_3119560.shl

 

Restart of jobs agenc001 and rwrdcln was successful. They finished.

I have to check out swpcofa a bit more before resetting.

If you have compiled swpcofa like the others, I believe they should be fine.

Any other custom ones that need recompiling?

The memory fault error in certain Banner C jobs has been fixed. Mark Britton had to recompile them. Apparently these are custom Banner C jobs and this was not taken into account when compiling them the first time.

Reset of agenc001 and rwrdcln  was successful.

Gudrun.
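Error code 139 fits the memory fault described above if the wrapper follows the common shell convention of reporting 128 plus the signal number. That convention is an assumption here, since the log does not say how UC4gjajobs.shl derives its return code. A minimal sketch of decoding such a status:

```python
import signal


def describe_exit(code):
    """Interpret a shell-style exit status; values above 128 conventionally mean 128 + signal."""
    if code > 128:
        sig = code - 128
        try:
            return f"terminated by {signal.Signals(sig).name} (signal {sig})"
        except ValueError:
            # Signal number not known on this platform.
            return f"terminated by signal {sig}"
    return f"exited with status {code}"


print(describe_exit(139))
```

Signal 11 is the segmentation violation signal on common Unix systems, which is in line with a C program that was not compiled correctly for its environment.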

 

 

 

Aborted Module Name:  AGENDYHB.AGENC001_01

  Date:        Day:      Time:          Resolution:

06/24/13     Mon        19:02           Restarted by Gudrun.

 

Error log and follow up comments:

 

Output directory /appworx/out

User search pattern:.*[^0-9']{ONE_UP}[^0-9].*

User search evaluated: .*[^0-9']3119612[^0-9].*

Search /appworx/out using .*[^0-9']3119612[^0-9].*

Files [/appworx/out/agenc001_3119612.log]

User rename pattern:AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.{ONE_UP}.{fileext}

User rename evaluated: AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.{ONE_UP}.{fileext}

Renaming file from:/appworx/out/agenc001_3119612.log

to:/appworx/out/AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.3119612.log

Cap: Error executing Banner commandError code returned:139

Command:/appworx/banner/banprod/UC4gjajobs.shl agenc001 C jobprd pass 3119612 NOPRINT 

******************************************* End of job log *******************************************

Job Aborted: : Error executing Banner commandError code returned:139

 

All is not quite well yet in AppMan Banner world. There are two jobs aborted in AWPROD that failed because they encountered a memory fault error during execution.

Only an empty Banner .log file is created. .lis file is missing.

 

Not all Banner C jobs appear to throw this error so something is different about these jobs.

 

SWPCOFA is run as part of FAID process flow FAIDCFA2_COF_ATTRIBUTES_2

 AGENC001 is run as part of AGEN process flow AGENDYHB_HRMS_NEW_PERSON_BRDG

Could these jobs possibly be recompiled?  I will restart and see if the issue remains.

+ 1>> /appworx/out/swpcofa_3119560.in

 

+ echo $JOB -f -o $H/$TEMP.lis 0<$H/$TEMP.in 1>$LOG 2>&1

+ 1>> /appworx/out/swpcofa_3119560.shl

+ echo progRet=$?

+ 1>> /appworx/out/swpcofa_3119560.shl

 

Restart of jobs agenc001 and rwrdcln was successful. They finished.

I have to check out swpcofa a bit more before resetting.

If you have compiled swpcofa like the others, I believe they should be fine.

Any other custom ones that need recompiling?

The memory fault error in certain Banner C jobs has been fixed. Mark Britton had to recompile them. Apparently these are custom Banner C jobs and this was not taken into account when compiling them the first time.

Reset of agenc001 and rwrdcln  was successful.

Gudrun.

 

 

 

Aborted Module Name:  FAIDCFA2_SM.SWPCOFA_06

  Date:        Day:      Time:          Resolution:

06/24/13     Mon        17:13          Restarted by Gudrun.

 

Error log and follow up comments:

 

 

spawned_module_name=SWPCOFA

+ . SRC_APMX_STATUS_FOR_SPAWNED.KSH

+ set -x

+ awexe jh

+ grep 10819680

+ egrep ABORTED|CRITFAIL|C-Error

     10819680.00 BANNER    FAIDCFA2_SM.SWPCOFA_06/24 17:13 00:00:15 ABORTED                FAIDCFA2_COF_ATTRIBUTES_2

+ print Failure in spawned SWPCOFA - abort this module

Failure in spawned SWPCOFA - abort this module

+ exit 1
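The trace above filters the job history for the spawned job's id and then for a failure status. A minimal sketch of that check is below. It assumes the module exits 1 whenever the filter finds a line; failed_entries is a hypothetical helper, and the second history line is made up.

```python
FAILURE_STATES = ("ABORTED", "CRITFAIL", "C-Error")


def failed_entries(history_lines, job_id):
    """Mimic `grep <job_id> | egrep 'ABORTED|CRITFAIL|C-Error'` over job-history text."""
    return [
        line
        for line in history_lines
        if job_id in line and any(state in line for state in FAILURE_STATES)
    ]


history = [
    # Line from the trace above.
    "     10819680.00 BANNER    FAIDCFA2_SM.SWPCOFA_06/24 17:13 00:00:15 ABORTED                FAIDCFA2_COF_ATTRIBUTES_2",
    # Made-up line for a job that finished normally.
    "     10819999.00 BANNER    SOME_OTHER_JOB      06/24 17:20 00:00:10 FINISHED               SOME_FLOW",
]

hits = failed_entries(history, "10819680")
exit_code = 1 if hits else 0
print(exit_code)
```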

Output directory /appworx/out

User search pattern:.*[^0-9']{ONE_UP}[^0-9].*

User search evaluated: .*[^0-9']3119612[^0-9].*

Search /appworx/out using .*[^0-9']3119612[^0-9].*

Files [/appworx/out/agenc001_3119612.log]

User rename pattern:AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.{ONE_UP}.{fileext}

User rename evaluated: AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.{ONE_UP}.{fileext}

Renaming file from:/appworx/out/agenc001_3119612.log

to:/appworx/out/AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.3119612.log

Cap: Error executing Banner commandError code returned:139

Command:/appworx/banner/banprod/UC4gjajobs.shl agenc001 C jobprd pass 3119612 NOPRINT 

******************************************* End of job log *******************************************

Job Aborted: : Error executing Banner commandError code returned:139

 

All is not quite well yet in AppMan Banner world. There are two jobs aborted in AWPROD that failed because they encountered a memory fault error during execution.

Only an empty Banner .log file is created. .lis file is missing.

 

Not all Banner C jobs appear to throw this error so something is different about these jobs.

 

SWPCOFA is run as part of FAID process flow FAIDCFA2_COF_ATTRIBUTES_2

 AGENC001 is run as part of AGEN process flow AGENDYHB_HRMS_NEW_PERSON_BRDG

Could these jobs possibly be recompiled?  I will restart and see if the issue remains.

+ 1>> /appworx/out/swpcofa_3119560.in

 

+ echo $JOB -f -o $H/$TEMP.lis 0<$H/$TEMP.in 1>$LOG 2>&1

+ 1>> /appworx/out/swpcofa_3119560.shl

+ echo progRet=$?

+ 1>> /appworx/out/swpcofa_3119560.shl

 

Restart of jobs agenc001 and rwrdcln was successful. They finished.

I have to check out swpcofa a bit more before resetting.

If you have compiled swpcofa like the others, I believe they should be fine.

Any other custom ones that need recompiling?

The memory fault error in certain Banner C jobs has been fixed. Mark Britton had to recompile them. Apparently these are custom Banner C jobs and this was not taken into account when compiling them the first time.

Reset of agenc001 and rwrdcln  was successful.

Gudrun.

 

 

 

Aborted Module Name:  FAIDCFA2_FA.SWPCOFA_06

  Date:        Day:      Time:          Resolution:

06/24/13     Mon        17:12           Restarted by Gudrun.

 

Error log and follow up comments:

 

+ egrep ABORTED|CRITFAIL|C-Error

     10819677.00 BANNER    FAIDCFA2_FA.SWPCOFA_06/24 17:13 00:00:16 ABORTED                FAIDCFA2_COF_ATTRIBUTES_2

+ print Failure in spawned SWPCOFA - abort this module

Failure in spawned SWPCOFA - abort this module

+ exit 1

Output directory /appworx/out

User search pattern:.*[^0-9']{ONE_UP}[^0-9].*

User search evaluated: .*[^0-9']3119612[^0-9].*

Search /appworx/out using .*[^0-9']3119612[^0-9].*

Files [/appworx/out/agenc001_3119612.log]

User rename pattern:AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.{ONE_UP}.{fileext}

User rename evaluated: AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.{ONE_UP}.{fileext}

Renaming file from:/appworx/out/agenc001_3119612.log

to:/appworx/out/AGENDYHB.AGENC001_01.BANNER.10817506.10817507.00.3119612.log

Cap: Error executing Banner commandError code returned:139

Command:/appworx/banner/banprod/UC4gjajobs.shl agenc001 C jobprd pass 3119612 NOPRINT 

******************************************* End of job log *******************************************

Job Aborted: : Error executing Banner commandError code returned:139

 

All is not quite well yet in AppMan Banner world. There are two jobs aborted in AWPROD that failed because they encountered a memory fault error during execution.

Only an empty Banner .log file is created. .lis file is missing.

 

Not all Banner C jobs appear to throw this error so something is different about these jobs.

 

SWPCOFA is run as part of FAID process flow FAIDCFA2_COF_ATTRIBUTES_2

 AGENC001 is run as part of AGEN process flow AGENDYHB_HRMS_NEW_PERSON_BRDG

Could these jobs possibly be recompiled?  I will restart and see if the issue remains.

+ 1>> /appworx/out/swpcofa_3119560.in

 

+ echo $JOB -f -o $H/$TEMP.lis 0<$H/$TEMP.in 1>$LOG 2>&1

+ 1>> /appworx/out/swpcofa_3119560.shl

+ echo progRet=$?

+ 1>> /appworx/out/swpcofa_3119560.shl

 

Restart of jobs agenc001 and rwrdcln was successful. They finished.

I have to check out swpcofa a bit more before resetting.

If you have compiled swpcofa like the others, I believe they should be fine.

Any other custom ones that need recompiling?

The memory fault error in certain Banner C jobs has been fixed. Mark Britton had to recompile them. Apparently these are custom Banner C jobs and this was not taken into account when compiling them the first time.

Reset of agenc001 and rwrdcln  was successful.

Gudrun.

 

 

 

Aborted Module Name:   FAIDCFEX_FA.SWPCOFE_01

  Date:        Day:      Time:          Resolution:

06/25/13     Tue        10:22           Restarted by Joleen.

 

Error log and follow up comments:

 

 

Below is the error we located; maybe swpcofe needs to be compiled?

 

Cap: Error executing Banner commandError code returned:139

Command:/appworx/banner/banprod/UC4gjajobs.shl swpcofe C jobprd pass 3120593 NOPRINT 

******************************************* End of job log *******************************************

Job Aborted: : Error executing Banner commandError code returned:139

 

NOTE: This is the same error (memory error) we were getting last night for the following jobs in Gudrun’s message:

The ones aborted  in AWPROD with the memory error are:

 

Agenc001

Swpcofa

Rwrdcln

 

Also, I will be running SWPCOFI next, so maybe it needs to be compiled?

 

This was recompiled last night.  I will redo it again but we may have a more severe issue with this one.

Mark B.

 

It looks like it is working. I reset one of the aborted SWPCOFE jobs and it has been running for more than 8 minutes now.

Joleen.

 

 

 

Aborted Module Name:   AREGCNTB.ODSRS100_01

  Date:        Day:      Time:          Resolution:

06/29/13     Sat        07:57            Restarted by David.

 

Error log and follow up comments:

 

ORA-00001: unique constraint (CSUBAN.CSUS_APPLICANT_CEN_CUR_IX_01) violated

ORA-06512: at line 267

07:56:18 263  --*--------------------------------------------------------------------*

07:56:18 264  --************ ADD Records to CUR table from view course_schedule *****

07:56:18 265  --*--------------------------------------------------------------------*

07:56:18 266                           begin <<add_cur3>>

07:56:18 267                                       insert into csus_applicant_cen_cur

07:56:18 268                                       (select * from csus_applicant

07:56:18 269                                        where ltrim(rtrim(term)) = csus_f_cur_term_ods);

 

OK, I have figured out what the problem is.

We have 2 Fees Paid records for

10841207             826234354           Kundid, Tobi Jean for Application Reference Number 4 in MST_ADMISSIONS_REQUIREMENT.

There should only be one record returned from the following, but there are 2.

I think that the FE50 record should be removed from MST_ADMISSIONS_REQUIREMENT.  I will need someone to help do this.  The select statement that the delete statement can be built from is at the bottom of my reply.

Mark can you do this?

select *

from MST_ADMISSIONS_REQUIREMENT R

where R.PERSON_UID = 10841207

and   R.APPLICATION_NUMBER = 4

AND R.REQUIREMENT LIKE 'F%';

10841207             2013       2013-2014            201360  Summer Session 2013                                    4              FE50       $50 Application Processing Fee  Y                              Y                                                              23-MAY-13                                                          U             Y                                                                              06-JUN-13           06-JUN-13           U

10841207             2013       2013-2014            201360  Summer Session 2013                                    4              F150       $150 Readmission Fee        Y                              Y                                                              03-JUN-13                           $100 check/$50 via credit card                    U                Y                                                                              06-JUN-13           06-JUN-13           U

 

Select Statement that selects the record to be deleted from ODSPROD that is causing CSUS_APPLICANT_CEN_CUR to get a unique constraint error.

select *

from MST_ADMISSIONS_REQUIREMENT R

where R.PERSON_UID = 10841207

and   R.APPLICATION_NUMBER = 4

AND R.REQUIREMENT    = 'FE50';

Vicki.

 

The record has been deleted in ODS Prod.

Mark. P.

 

AREGCNTB is complete. Thanks to Vicki and Mark P. for resolving the duplicate record issue.

I have added a condition to make AREGCNTB critical, since this is such an important process flow.

David.

 

 

 

 

 

Aborted Module Name:   AREGRTWL_FA.SFRBWLP_01

  Date:        Day:      Time:          Resolution:

07/01/13     Mon       00:09            Restarted by David.

 

Error log and follow up comments:

 

Check Backlog for ABORTED jobs (so_status  202)                                           

Job                    Chain Id Start Date              Status Status Name Percentage Diff Observed Run Time (Min) Average Run Time (Min)

---------------------- -------- ----------------------- ------ ----------- --------------- ----------------------- ----------------------

AREGRTWL_FA.SFRBWLP_01 10865736 07-01-2013 00:09:58 MDT    202 NOBANLOG              51.48                    1509                   2931

 

 

I alerted the DBA on call about the failure. It appears SFRBWLP needs to be investigated further. The manual restart by a functional user last Monday did not fix the issue. For some reason, ever since the last upgrade, a restart of this job on Sundays at 00:05 continues to fail. The only known fix is a manual restart by a functional user.

This was not necessary prior to the upgrade.

Again, the SQL below for subvar AREGRTWL_FA_NEXT_EXECUTION does not return a value.

SELECT nvl(to_char(gjrswpt_next_execution, 'yyyy/mm/dd hh24:mi:ss'), to_char((sysdate + 1), 'yyyy/mm/dd hh24:mi:ss')) from gjrswpt where gjrswpt_process='SFRBWLP'

and gjrswpt_continue_ind = 'Y'

and gjrswpt_printer='FA_SFRBWLP'

Value returned: blank

Checking other values it appears gjrswpt_continue_ind is set to 'N'. 

SELECT nvl(to_char(gjrswpt_next_execution, 'yyyy/mm/dd hh24:mi:ss'), to_char((sysdate + 1), 'yyyy/mm/dd hh24:mi:ss')) from gjrswpt where gjrswpt_process='SFRBWLP'

and gjrswpt_continue_ind = 'N'

and gjrswpt_printer='FA_SFRBWLP'

Value returned: 2013/06/30 22:39:11

AREG developer needs to check on data setup for this job.

 

After talking to Shawn, we concluded the job needs further investigation. The DBA is also not aware at this point how to restart it manually. In short, the manual restart of the job by a functional user last Monday did not fix the underlying restart issue.  As last weekend, the SQL below for subvar AREGRTWL_FA_NEXT_EXECUTION does not return a value.

SELECT nvl(to_char(gjrswpt_next_execution, 'yyyy/mm/dd hh24:mi:ss'), to_char((sysdate + 1), 'yyyy/mm/dd hh24:mi:ss')) from gjrswpt where gjrswpt_process='SFRBWLP'

and gjrswpt_continue_ind = 'Y'

and gjrswpt_printer='FA_SFRBWLP'

Value returned: blank

Checking other values, it appears gjrswpt_continue_ind is set to 'N' instead of the expected 'Y'!

SELECT nvl(to_char(gjrswpt_next_execution, 'yyyy/mm/dd hh24:mi:ss'), to_char((sysdate + 1), 'yyyy/mm/dd hh24:mi:ss')) from gjrswpt where gjrswpt_process='SFRBWLP'

and gjrswpt_continue_ind = 'N'

and gjrswpt_printer='FA_SFRBWLP'

Value returned: 2013/06/30 22:39:11

AREG developer needs to check on data setup for this job. The continue indicator gets changed to N. Why?

Gudrun.
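
The blank result above is consistent with how NVL works: it substitutes a default only for a NULL inside a row that the query returns, so when no row matches gjrswpt_continue_ind = 'Y' there is nothing for it to act on. A minimal sketch of that behavior follows, using Python's sqlite3 with COALESCE standing in for Oracle's NVL. The one-row table, the FALLBACK constant (standing in for sysdate + 1), and all function names are hypothetical, and the aggregate variant is only one possible way to always get a value back, not something decided in this thread.

```python
import sqlite3

# Hypothetical stand-in for sysdate + 1 in the original query.
FALLBACK = "2013/07/02 00:00:00"

WHERE = (
    "FROM gjrswpt WHERE gjrswpt_process = 'SFRBWLP' "
    "AND gjrswpt_continue_ind = ? AND gjrswpt_printer = 'FA_SFRBWLP'"
)


def make_db():
    """Build an in-memory table holding the single row seen in the log."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gjrswpt (gjrswpt_process TEXT, gjrswpt_printer TEXT, "
        "gjrswpt_continue_ind TEXT, gjrswpt_next_execution TEXT)"
    )
    conn.execute(
        "INSERT INTO gjrswpt VALUES "
        "('SFRBWLP', 'FA_SFRBWLP', 'N', '2013/06/30 22:39:11')"
    )
    return conn


def per_row(conn, continue_ind):
    """Same shape as the subvar SQL: the default applies per returned row."""
    sql = "SELECT COALESCE(gjrswpt_next_execution, ?) " + WHERE
    return [row[0] for row in conn.execute(sql, (FALLBACK, continue_ind))]


def aggregated(conn, continue_ind):
    """An aggregate returns exactly one row even when nothing matches."""
    sql = "SELECT COALESCE(MAX(gjrswpt_next_execution), ?) " + WHERE
    return conn.execute(sql, (FALLBACK, continue_ind)).fetchone()[0]
```

With the row as found (continue indicator 'N'), per_row(conn, "Y") yields no rows at all, which matches the blank value reported, while aggregated(conn, "Y") yields the fallback. Either way, the data question of why the indicator flips to 'N' remains open.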

 

Gudrun called to report that SFRBWLP could not be restarted. This occurred at about the same time last week also. Per Gudrun, DBA email exchanges, and the I.S. news entry at the time, restarting had to be performed manually in the morning. It appears some kind of coordination or setup will be required by functional users in the morning.

Shawn.

 

 

 

 

Aborted Module Name:   FAIDALCT.SSH_SFTP_RN_02

  Date:        Day:      Time:          Resolution:

07/01/13     Mon       07:04            Restarted by Dermot.

 

 

Error log and follow up comments:

 

 

***

*** SEARCH OF FTP JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

# > Permission denied (password,gssapi-with-mic).

 

 

I was able to successfully manually connect to the server.  We should be able to restart it.  Please be sure to check for any notes in the process flow/job.  Please let me know if you have any questions or if there are no restart notes.

Elden.

 

 

 

 

Aborted Module Name:   FAIDSUNT.LYNX_01

  Date:        Day:      Time:          Resolution:

07/01/13     Mon       22:01           Restarted by Joleen.

07/02/13     Tue        22:02           Restarted by Joleen.

 

Error log and follow up comments:

 

07/01/13.  

[ArgumentOutOfRangeException]: StartIndex cannot be less than zero.

 

This is fixed now; you can choose to re-run it or skip the LYNX module.

Zach.

 

I have skipped the LYNX module. FAIDSUNT_SUMMER_NOTIFICATIONS has finished running.

Joleen.

 

07/02/13.    

[OracleException]: ORA-00936: missing expression

 

I fixed the issue; you can either restart the job or finish it without running the LYNX module.

Zach.

 

Thank you for all your help! FAIDSUNT_SUMMER_NOTIFICATIONS has finished running.

Joleen.

 

 

 

 

 

Aborted Module Name: KFSXFPPY.KFSX_JAVA_01 

  Date:        Day:      Time:          Resolution:

07/05/13     Fri          19:19           Restarted by Dermot.

07/07/14     Mon       19:25           Restarted by Dermot.

 

Error log and follow up comments:

 

07/05/13.

File XML Dept Contact Info :  CenRec Shipping CENREC_iship_uploads@mail.colostate.edu 491-5346 CenRec 225.02 2013-07-05

                Finished step: ValidateCollectorXml

                Step ValidateCollectorXml of KFSXGLCL_D1.ValidateCollectorXml.10902104.10903822.00 took 66 milliseconds to complete

               *******************************************************

                Finished step: procurementCardAutoApproveDocumentsStep

                Step procurementCardAutoApproveDocumentsStep of KFSXFPAA.procurementCardAutoApproveDocumentsStep.10902482.10902486.00 took 2.4310666666666667 minutes to complete

 

2013-07-05 19:21:50,174 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springframework.dao.DataAccessResourceFailureException: Could not open OJB PersistenceBroker; nested exception is org.apache.ojb.broker.PBFactoryException: Transaction synchronization failed - wrong status of external JTA tx. Expected was an 'active' or 'no transaction', found status is 'STATUS_MARKED_ROLLBACK'

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.35#> [[ 1 > 0 ]]

<#errtrap_ssh.38#> exit 1

<</ais02/job/prod/kshexe_ssh.92>> errtrap_ssh kshexe_ssh 1

 

As per Josh I restarted the job & it finished successfully.

Dermot.

 

 

07/07/14.

The job failed with an "OptimisticLock" error:

Caused by: org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: org.kuali.rice.kew.useroptions.UserOptions@7e585da9

I restarted the job & it is now running.

Dermot.

 

 

 

 

 

 

Aborted Module Name:   KFSXFPPD.KFSX_JAVA_01 

  Date:        Day:      Time:          Resolution:

07/12/13     Fri          14:01           Restarted by Dermot.

 

Error log and follow up comments:

 

                *******************************************************

                Started processing step disbursementVoucherPreDisbursementProcessorExtractStep of job KFSXFPPD.disbursementVoucherPreDisbursementProcessorExtractStep.10949334.10949378.00 for user kr

                Executing step: disbursementVoucherPreDisbursementProcessorExtractStep

                #### Log file name for this job step : /ais02/app/kfs/prd/logs/KFSXFPPD.disbursementVoucherPreDisbursementProcessorExtractStep.10949334.10949378.00-20130712-14-00-17-517.log

                *******************************************************

 

2013-07-12 14:24:32,129 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springmodules.orm.ojb.OjbOperationException: OJB operation failed; nested exception is org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: financialSystemDocumentHeader(documentNumber)=(2500580)

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

<#errtrap_ssh.35#> [[ 1 > 0 ]]

<#errtrap_ssh.38#> exit 1

 

The job failed with an “OptimisticLockException”. I restarted the job & it completed okay.

Dermot.

 

 

 

 

 

Aborted Module Name:   HRMSVSTA.VPLUS_HIERARCHY_01

  Date:        Day:      Time:          Resolution:

07/13/13     Sat          12:34           See follow up below.

 

Error log and follow up comments:

 

$ hostname

Monarch

$ whoami

jobprd

$ pwd

/home/jobprd/work/vplusprod.is.colostate.edu.20130713_123416

 

I logged on successfully to monarch via SSH and found the error message below in file HRMSVSTA.VPLUS_HIERARCHY_01.00000000.00000000.00.2013_07_13_1234.LOG

 

# 2013.07.13-12:59:56 : >>  [vadmin_page_security::vadmin_delete_page_access]

# 2013.07.13-12:59:56 : >>  [vadmin_page_security::exec_cmd]

# vadmin_page_security::exec_cmd: cmd=[/vplus/vadmin HOST=vplusprod.is.colostate.edu command="DeletePageAccess" report="HRMSMGMT.HRMSR001SS_01" set="Full_Access-HRMSMGMT.HRMSR001SS_01" verify=n 2>&1 |]

# ** EXCEPTION **     : FATAL: cmd=[/vplus/vadmin HOST=vplusprod.is.colostate.edu command="DeletePageAccess" report="HRMSMGMT.HRMSR001SS_01" set="Full_Access-HRMSMGMT.HRMSR001SS_01" verify=n 2>&1 |]

# ** EXCEPTION **     : FATAL: Unknown error returned by Vista Plus server.

# ** EXCEPTION **     : FATAL: HOST: vplusprod.is.colostate.edu  --  PORT: 7980

# ** EXCEPTION **     : FATAL: close status=(53248) (Command failed with code) (208) syserr=()

# ** FATAL CALLER **  : [0] : vadmin_page_security;/home/jobprd/work/vplusprod.is.colostate.edu.20130713_123416/VPLUS_HIERARCHY.PL;680;vadmin_page_security::fatal;1

# ** FATAL CALLER **  : [1] : vadmin_page_security;/home/jobprd/work/vplusprod.is.colostate.edu.20130713_123416/VPLUS_HIERARCHY.PL;3578;vadmin_page_security::exec_cmd;1

# ** FATAL CALLER **  : [2] : vadmin_page_security;/home/jobprd/work/vplusprod.is.colostate.edu.20130713_123416/VPLUS_HIERARCHY.PL;6982;vadmin_page_security::vadmin_delete_page_access;1

# ** FATAL CALLER **  : [3] : vadmin_page_security;/home/jobprd/work/vplusprod.is.colostate.edu.20130713_123416/VPLUS_HIERARCHY.PL;7145;vadmin_page_security::reset_vista_delete_report_accesses;1

# ** FATAL CALLER **  : [4] : main;/home/jobprd/work/vplusprod.is.colostate.edu.20130713_123416/VPLUS_HIERARCHY.PL;7188;vadmin_page_security::reset_hierarchy;1

 

I saw it out there too yesterday, but the error is fatal and needs to be checked out in more detail. DeletePageAccess fails for report HRMSMGMT.HRMSR001SS_01.

Gudrun.
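
The two numbers in the close status line above appear to be the same failure in two forms: 53248 is 208 shifted left by eight bits, which is how a POSIX wait status carries a child's exit code. A small sketch of that arithmetic, assuming the status is a conventional 16-bit wait status (the function name is hypothetical):

```python
def decode_wait_status(status):
    """Split a conventional 16-bit wait status into (exit_code, signal)."""
    exit_code = status >> 8      # high byte: exit code of the child process
    signal = status & 0x7F       # low 7 bits: terminating signal, 0 if none
    return exit_code, signal
```

Under that assumption, decode_wait_status(53248) gives exit code 208 with no signal, so vadmin itself exited with 208. What 208 means is defined by Vista Plus and is not stated in the log.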

 

Not sure if you check the hierarchy on the weekend now that it’s looking nice.

I saw it aborted and was not sure if you needed to be called. It's not critical, but it can leave people without access.

Rich.

 

This could have been restarted – I think databases are being recycled now, so best not to do it at the moment.

Saw the notes about HRMSVSTA.SSH_VPLUS_HIER_01.  Waited until ODBACYCP finished.

* Tried restarting SSH_VPLUS_HIER_01, but failed again trying to execute DeletePageAccess for report="HRMSMGMT.HRMSR001SS_01" set="HRMS_0001-HRMSMGMT.HRMSR001SS_01".

* Tried restarting VPLUS app and restarted the job.  Failed again trying to execute DeletePageAccess for report="HRMSMGMT.HRMSR001SS_01" set="HRMS_0004-HRMSMGMT.HRMSR001SS_01".

* Reindexing reports, then will try again.

Elden.

 

 

 

 

 

 

Aborted Module Name:   AREGTTRN.RWCLIENT_01

  Date:        Day:      Time:          Resolution:

07/29/13     Mon       09:30           Restarted by David.

 

Error log and follow up comments:

 

Mark,

RWCLIENT is failing again, do you see anything wrong with the reports server?

 

+ cat /ais02/log/AREGTTRN.RWCLIENT_01.11075565.SEND_MAIL_ERR.DAT

The Oracle base for ORACLE_HOME=/app/oracle/product/weblogic_ban_prod/as_1 is /app/oracle/product/weblogic_ban_prod

REP-0178: Reports Server rep11g_banprod cannot establish connection.

David.

 

Please see if (Note: 734293.1 - JDBC Connections from E-Business Suite Application Tier Fail with "java.sql.SQLException: Io exception: There is no process to read data written to a pipe") helps.

Got this from:   (scroll to bottom of it)

https://forums.oracle.com/thread/1083413

Our recurring error in the RMI log of AppMan is similar:

ErrorMsg: AwE-5001 Database Query Error (7/29/13 9:32 AM)

Details: getConnection         null

java.sql.SQLException: Io exception: There is no process to read data written to a pipe.

        at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:134)

        at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:179)

        at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:334)

        at oracle.jdbc.ttc7.TTC7Protocol.handleIOException(TTC7Protocol.java:3649)

Gudrun.

 

I’m looking at this…I’m in the middle of a training for the new Ellucian support center right now.  For the meantime, can you just keep retrying the report job until I get a chance to look into this…it’ll work eventually.

I don’t think this article pertains to the problem we’re having but thanks for looking.  That article is about Oracle Apps and all it says is to take out the timeout setting…I don’t want to change that setting because that will just cause things to hang when the connection issue is happening.  I am working with ACNS to see if we can spot anything odd.  It looks like things have calmed down now, so for the time being hopefully things are running ok.

Mark.

 

Could it be the AIX Java we are using for our JDBC connections?  Any compatibility issues that only surfaced now?

Java we are using with AppMan:

java version "1.6.0"

Java(TM) SE Runtime Environment (build pap3260sr9fp2-20110627_03(SR9 FP2))

IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 AIX ppc-32 jvmap3260sr9-20110624_85526 (JIT enabled, AOT enabled)

J9VM - 20110624_085526

JIT  - r9_20101028_17488ifx17

GC   - 20101027_AA)

JCL  - 20110530_01

It appears there are reported connectivity issues for AIX 6.1 java out on the web.

Gudrun.

 

 

 

 

 

Aborted Module Name:   KFSXAM14.KFSXS030_01

  Date:        Day:      Time:          Resolution:

07/31/13     Wed       22:16           Restarted by Dermot.

 

Error log and follow up comments:

 

 

07/31/2013 22:30    DBARRETT

KFSXAM14.KFSXS030_01 / KFSXAM14_MONTH_END_CLOSE_TASKS ABORTED with the following error.

 

22:16:23 121  When period_not_closed Then

22:16:23 122  raise_application_error(-20000, '****

FATAL ERROR - Period '||v_close_fy||'-'||v_close_period||'

not Closed! ****');

22:16:23 123  End;

22:16:23 124  /

old   2: v_close_fy   number      := '&close_fy';

new   2: v_close_fy   number      := '2014';

old   3: v_close_period  VARCHAR2(2) := '&close_period';

new   3: v_close_period  VARCHAR2(2) := '01';

**** Start of KFSXS030 07/31/2013 22:16:23

ORA-20000: **** FATAL ERROR - Period 2014-01 not Closed!

ORA-06512: at line 122

 

Contacted Josh, who put in a temporary fix; the job completed successfully after a restart.

Dermot.

 

 

 

Aborted Module Name:   HRMSKFSA.HRMS_SPAWN_OUT_01

  Date:        Day:      Time:          Resolution:

08/15/13     Thu         17:43          Deleted by Steve. G.

 

Error log and follow up comments:

 

 

*** COPY SPAWNED CONCURRENT REQUEST OUTPUT TO JOB OUTPUT FILE

+ print *** \n*** OUTPUT FROM SPAWNED CONCURRENT REQUEST 7806192 (PARENT REQUEST 7806171): \n***

+ 1>> /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out

+ cat /oraapps/hrprod/out/o7806192.out

+ 1>> /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out

+ read this_spawned_req

+ cut -f2 -d ?

+ grep C

+ print 7806193?C

C

+ + cut -f1 -d ?

+ print 7806193?C

spawned_req_no=7806193

+ spawned_consub_out=/oraapps/hrprod/out/o7806193.out

+ [[ -s /oraapps/hrprod/out/o7806193.out ]]

+ print *** \n*** COPY SPAWNED CONCURRENT REQUEST OUTPUT TO JOB OUTPUT FILE \n***

 

2013-08-15 17:43:04 Prompt 1 changed from "{#spawn_spool_{chain_id}}" to "/ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_FIND_01.spool.lis" by OSU=appworx JDBC Thin Client

2013-08-15 17:43:04 Prompt 2 changed from "{#workdat}/{module}.Spawned_Out" to "/ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out" by OSU=appworx JDBC Thin Client

 

Can you research the cause of this abort?  The problem seems to be with the concurrent request shown below.  In kebler, the file /ais01/dat/work/prod/HRMSKFSA.HRMS_SPAWN_OUT_01.Spawned_Out  should give you the error information.  We can’t proceed on the AppMan side until we get some guidance from you guys -- Thanks!

Steve G.

 

After discussing with Steve Hill, it was decided to delete this aborted component.  HRMSKFSA_KFS_ADJUSTMENTS is now complete.

Steve. G.

 

 

Aborted Module Name:  ADMSLETP.SPOOL_TO_PRINT_01

 

  Date:        Day:      Time:          Resolution:

08/15/13     Thu         20:03          Restarted by Joleen.

 

Error log and follow up comments:

 

 

I have attached the output file. PLEASE DO NOT RESTART-SEE NOTE BELOW

 

# -> [*******************************************************************************]

# -> [FATAL EXIT CALLED FROM [spool_filter::fatal]]

# -> [-------------------------------------------------------------------------------]

# -> [ERROR: file /ais01/spool/vipp/rnsr4.ini not found]

# -> [-------------------------------------------------------------------------------]

# -> [[ 2013.08.15-20:03:45 ]]

# -> [RETURN CODE = 100]

# -> [===============================================================================]

+ exit 100

+ err=100

+ [ 100 -eq 0 ]

+ [ 100 != 0 ]

+ status=ABORTD

 

NOTE:

I’m not sure who should get this message but thought all of you would be interested.  Admissions discussed the ADMSLETP job aborting and in the future we would prefer that it is not automatically restarted.  We would like to review this before we decide if the job should be rerun or not. 

 

ADMSLETP is a job that prints letters so I’m guessing that they are printing correctly now but we won’t see the final product until later.  But I do think this job is running ok now.

Marcella.

 

Marcella – I ran the script to move rnsr4 to the printer at 0620 on 08/06/2013.  I didn't receive an error message, but it appears the process didn't properly complete.  Do you want me to use the same project from 08/05/2013 or do you want to regenerate a VIPP project in _pending_printer_ ?

Elden.

 

I restarted ADMSLETP.SPOOL_TO_PRINT_01 and it has finished running.

Elden was in communication with Marcella and resolved the issue:

The rnsr4.ini file is now available.  Joleen, please restart the SPOOL_TO_PRINT in accordance with available documentation.

Joleen.

 

 

 

 

Aborted Module Name:   AREGFQTR.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

08/20/13     Tue         19:22          Restarted by Joleen.

 

Error log and follow up comments:

 

Would it be OK to try to restart? I don’t think we received any files; there aren’t any in the /ais01/ftp/from/user directory.

Do we need to see if we can manually connect to their server first, or should we have Matt contact Scrip Safe to see if we need new host keys?

Joleen.

#==============================================================================

# > @       WARNING: POSSIBLE DNS SPOOFING DETECTED!          @

# > The RSA host key for iwantmytranscript.com has changed,

# > and the key for the corresponding IP address 192.241.176.90

# > is unknown. This could either mean that

# > DNS SPOOFING is happening or the IP address for the host

# > and its host key have changed at the same time.

# > @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @

# > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!

# > Someone could be eavesdropping on you right now (man-in-the-middle attack)!

# > It is also possible that a host key has just been changed.

# > The fingerprint for the RSA key sent by the remote host is

# > 65:bd:2e:c2:e5:d5:0f:48:4e:74:62:cd:3b:99:23:02.

# > Please contact your system administrator.

# > Add correct host key in /home/jobprd/.ssh/known_hosts to get rid of this message.

# > Offending RSA key in /home/jobprd/.ssh/known_hosts:51

# > RSA host key for iwantmytranscript.com has changed and you have requested strict checking.

# > Host key verification failed.

 

I think we ought to hold off on restarting the job until Matt can contact ScripSafe.  I'd also rather let Elden see this before we try to reconnect.
Rob.

 

I have emailed Scrip_Safe support letting them know we received that error message and have not tried connecting again…

Matt.

 

It is indeed important to verify with ScripSafe if they changed their environment.  This may be legitimate if they changed their server or upgraded SSH and didn't preserve the original server keys. If ScripSafe confirms this is correct, then we will need to delete the old server key from jobprd's known_hosts file and manually connect to save the new server key and verify the connection.

If they revert to the old server, then we may still wish to manually connect to confirm connectivity.

Elden.
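
When ScripSafe confirms a new key, the fingerprint printed in the warning can be compared with the key they publish before known_hosts is changed. A minimal sketch of that comparison, assuming an unhashed known_hosts-style line of the form "host keytype base64key" and the legacy colon-separated MD5 fingerprint format shown in the warning. The function name is hypothetical.

```python
import base64
import hashlib


def md5_fingerprint(known_hosts_line):
    """Return the colon-separated MD5 fingerprint of one public key entry."""
    fields = known_hosts_line.split()
    key_blob = base64.b64decode(fields[2])      # third field is the key data
    digest = hashlib.md5(key_blob).hexdigest()
    # Group the 32 hex digits into 16 byte pairs, e.g. "65:bd:2e:...".
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

If the computed value for the confirmed key matches 65:bd:2e:c2:e5:d5:0f:48:4e:74:62:cd:3b:99:23:02, the change is the one ScripSafe made. One way to drop the stale entry (line 51 in the warning) is ssh-keygen -R iwantmytranscript.com, run as jobprd, followed by the manual connection described above.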

 

Donna was able to get a hold of a support rep from ScripSafe.  They said they were performing server maintenance last night which caused our process to abort.  David has restarted the process and it appears to be running normally.  The printed transcripts should appear in the Registrar’s office shortly.  Please let me know if there are any further issues.

Rob.

 

Transcripts have been released to the printer.

David.

 

 

 

Aborted Module Name:   HRMSAM18.SEND_MAIL_01

  Date:        Day:      Time:          Resolution:

08/21/13     Wed        07:30          Restarted by Steve.

 

Error log and follow up comments:

 

The error was:

 

FATAL : Error opening address file (/ais01/dat/misc/mailst/SEND_MAIL.HRMSWKST.LST) : A file or directory in the path name does not exist.

 

The SEND_MAIL.HRMSWKST.LST had apparently been modified since the last time this ran, and had not been saved as a file with the same name.  We made a copy of the most recent version of the file, SEND_MAIL.HRMSWKST.LST.20120701_205125, and renamed it back to SEND_MAIL.HRMSWKST.LST, restarted the component and it finished successfully.

Steve. G.
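
The recovery above amounts to picking the newest timestamped copy of the list file and copying it back under the plain name. A sketch of that step, assuming backups carry a .YYYYMMDD_HHMMSS suffix as in SEND_MAIL.HRMSWKST.LST.20120701_205125. The function names are hypothetical, and the demo runs in a temporary directory rather than /ais01/dat/misc/mailst.

```python
import os
import re
import shutil
import tempfile

# Suffix pattern assumed from the one backup name quoted in the log.
STAMP = re.compile(r"\.\d{8}_\d{6}")


def restore_latest_backup(directory, base_name):
    """Copy the newest base_name.YYYYMMDD_HHMMSS file back to base_name."""
    backups = sorted(
        name for name in os.listdir(directory)
        if name.startswith(base_name)
        and STAMP.fullmatch(name[len(base_name):])
    )
    if not backups:
        raise FileNotFoundError("no timestamped copy of " + base_name)
    newest = backups[-1]            # fixed-width stamps sort chronologically
    shutil.copy2(os.path.join(directory, newest),
                 os.path.join(directory, base_name))
    return newest


def demo():
    """Create two fake backups and restore the newer one."""
    base = "SEND_MAIL.HRMSWKST.LST"
    with tempfile.TemporaryDirectory() as tmp:
        for stamp, body in (("20110101_000000", "old"),
                            ("20120701_205125", "new")):
            with open(os.path.join(tmp, base + "." + stamp), "w") as fh:
                fh.write(body)
        chosen = restore_latest_backup(tmp, base)
        with open(os.path.join(tmp, base)) as fh:
            return chosen, fh.read()
```

Copying rather than renaming keeps the timestamped version in place, which matches the "made a copy" wording above.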

 

 

 

 

Aborted Module Name:  KFSXSYPG.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

08/23/13     Fri          19:09           Restarted by Dermot.

 

Error log and follow up comments:

 

08/22/2013 20:26    DBARRETT

KFSXSYPG.KFSX_JAVA_01 / KFSXSYPG_SYS_PURGE ABORTED in java step "processPdpCancelsAndPaid"

with the following error message.

"2013-08-22 19:09:30,962 [main] ERROR

org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient ::

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1"

Contacted Mike who checked into it & had me restart the step, it completed successfully.

 

It is okay to restart this step if it ABORTS in the future with a similar error message.

Dermot.

 

 

 

Aborted Module Name:   KFSXPDCH.SEND_MAIL_01

  Date:        Day:      Time:          Resolution:

09/03/13     Tue         09:16           See follow up below.

 

Error log and follow up comments:

 

 

KFSXPDCH.SEND_MAIL_01 /  KFSXPDCH_PDP_CHECKS_EXTR is in a STATUS of “FOLLOWUP_CHK”.

 

 

# - *** MULTIPLE pdp_check*.xml files were found

# - ***

# - Manual Cleanup/Processing Required -- Contact Information Systems

# - /ais02/app/kfs/prd/work/staging/pdp/paymentExtract/pdp_check_20130903_073831.xml

# - /ais02/app/kfs/prd/work/staging/pdp/paymentExtract/pdp_check_20130830_153133.xml

 

 

If KFSXPDCH finds multiple check files:

Verify that BFS really wants both files to process.

If so, then rename one of the files to end with .Z

 

/ais02/app/kfs/prd/work/staging/pdp/paymentExtract/pdp_check_20130903_073831.xml.Z

Restart the step after resetting condition 1 from DONE to ONCE (see below).

 

When the ABORTED chain completes, rename the file back (remove .Z):

/ais02/app/kfs/prd/work/staging/pdp/paymentExtract/pdp_check_20130903_073831.xml

 

Request a stand-alone KFSXPDCH with a HOLD, then change prompt value 2 to either “A.M” or “P.M” & parameter 3 to “Y”.

 

Check the location /ais02/app/kfs/prd/work/staging/pdp/paymentExtract; both files should no longer be there.

Dermot.
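
The rename steps above can be sketched as a pair of helpers that park one extract under a .Z suffix and later put it back. This is only an illustration: the function names are hypothetical, which file to park remains an operator decision after checking with BFS, and the demo uses a temporary directory rather than the paymentExtract staging path.

```python
import glob
import os
import tempfile


def pending_check_files(staging_dir):
    """List the pdp_check_*.xml files that are still visible under that name."""
    return sorted(glob.glob(os.path.join(staging_dir, "pdp_check_*.xml")))


def park(path):
    """Hide one file from the *.xml pattern by appending .Z to its name."""
    parked = path + ".Z"
    os.rename(path, parked)
    return parked


def unpark(parked):
    """Undo park() once the aborted chain has completed."""
    original = parked[:-len(".Z")]
    os.rename(parked, original)
    return original


def demo():
    """Count visible files before parking, while parked, and after restoring."""
    names = ("pdp_check_20130830_153133.xml", "pdp_check_20130903_073831.xml")
    with tempfile.TemporaryDirectory() as tmp:
        for name in names:
            open(os.path.join(tmp, name), "w").close()
        before = len(pending_check_files(tmp))
        parked = park(pending_check_files(tmp)[-1])
        during = len(pending_check_files(tmp))
        unpark(parked)
        after = len(pending_check_files(tmp))
        return before, during, after
```

The demo parks the later file, as in the example path above, so only one extract matches the pattern while the restarted step runs.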

 

 

Aborted Module Name: FAIDNEED_EV.VPLUS_RCAPTURE_01

 

  Date:        Day:      Time:          Resolution:

09/03/13     Tue         12:04           See follow up below.

 

Error log and follow up comments:

 

a2ps: Memory exhausted

+ lf_RETURN_CODE=1

# INFO: /usr/bin/a2ps RETURN CODE = (1)

#-----------------------------------------------------------------------------#

# ERROR: Non-Zero RETURN CODE

#-----------------------------------------------------------------------------#

# Exiting /appworx/csu/exec/VPLUS_RCAPTURE.KSH with Return Code (100)

#=============================================================================#

error is 100

 

Does anyone have any ideas on resolving this abort? I'm trying to capture the FAIDNEED_EV report into Vista Plus.

Joleen.

 

The problem seems to be trying to put the output into Vista...Is there any way to run this without doing that part so I can see if the main part of the job is working or at least see the output that it's trying to put into Vista plus?   This is part of a critical problem I'm trying to resolve, and I'm not really concerned with the VistaPlus portion right now.

Mark.

 

I'm getting errors opening or emailing 2 of the output files with the red x.

Joleen.

 

The file appears to be too big for our utilities to handle:

  FAIDNEED_EV.672942.txt has 23636 formfeeds / 23637 pages

and it is over 106 MB. We'll need to discuss this report.

Elden.
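
A page count like the one above can be obtained by counting form feed characters, with one more page than form feeds when the last page has no trailing form feed. A minimal sketch, reading in chunks so a file of this size is not loaded into memory at once. The function name, chunk size, and sample data are hypothetical.

```python
import io


def count_pages(stream, chunk_size=1 << 20):
    """Return (formfeeds, pages) for a binary stream of report text."""
    formfeeds = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        formfeeds += chunk.count(b"\f")
    # Assumes the final page is not followed by a form feed.
    return formfeeds, formfeeds + 1


# Tiny in-memory sample: three pages separated by two form feeds.
sample = io.BytesIO(b"page 1\fpage 2\fpage 3")
```

For a real report the stream would come from open(path, "rb"). A check like this could run before the file is handed to a2ps, though the size limit a2ps can handle is not stated in the log.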

 

We noticed this also yesterday and Candy was able to create a smaller population for testing.

David.

 

It was just a test report that got too many people.  It would never run like this in production, so no need to spend time on it.

Mark.

 

 

 

 

Aborted Module Name:   ADMSSRLD_DY.SURLOAD_03

  Date:        Day:      Time:          Resolution:

09/05/13     Thu        22:30           Restarted by Joleen.

10/30/13     Wed       22:43           Restarted by Dermot.

 

Error log and follow up comments:

 

09/05/13.

05-SEP-2013 22:30:22                   Colorado State University        PAGE 1       999999        Communication Load                                                8.2

                                                       AUTOMATED GURMAIL LOAD

                                                            Load Errors

PIDM             Id                       Name                     System Indicator     Description/Comments

010496266   824112927     Crouse , Amy M                               S                    Not Loaded; Duplicate Non-printed letter

ORA-02291: integrity constraint (GENERAL.FK1_GLBEXTR_INV_GLBSLCT_KEY) violated - parent key not found

WRN-ORACERR: Error occurred in file "surload.pc" at line 705

WRN-ERRSTMT: Following statement was last statement parsed:

    insert into GLBEXTR (glbextr_key,glbextr_application,glbextr_selection

surload terminated with error

11 lines written to /appworx/out/surload_3197444.lis

 

I've removed the extra duplicate for this record.  Can we start this job again?

Marcella.

 

10/30/13.

30-OCT-2013 22:43:48                                 Colorado State University                                         PAGE 1

PIDM      Id                            Name                     System Indicator     Description/Comments

011410856 830223808 Backstrom , Wade L                           S                   Not Loaded; Duplicate Non-printed letter

 

I’m researching this but I don’t see what is wrong with this student’s record.  Did someone remove anything from his SUAMAIL record?

Marcella.

It looks like he has AEML_EEIDU and ADMT_AFCLR letters without a print date.  I assumed the surload was trying to add one of these when it already existed???  When it aborted yesterday, I think Trish deleted the record from suamail that was causing the problem - but I don't know which letter it was.

Kathy.

After looking at the EEIDU code I think that is the issue. It is trying to add a duplicate letter if one hasn’t already been printed.  I need to update this code anyway so I’ll be reviewing it today and tomorrow to get it right.  In the meantime I’ve added a 1/1/13 date to this EEIDU record in SUAMAIL.  I’m hoping another one will get generated and printed so that is why I left that first one out there.

Marcella.

 

 

 

Aborted Module Name:   ODSRADMS.ODSRS003_01

  Date:        Day:      Time:          Resolution:

09/07/13     Sat          07:40           Restarted by Joleen.

 

Error log and follow up comments:

 

 

09/07/2013 08:07    JWEARNE

I received this page:

Sat Sep  7 07:40:24 MDT 2013

**CRITICAL AWPROD COMPONENT FAILURE**

ODSRADMS

ODSRADMS.ODSRS003_01 has aborted. I called the DBA cell.

Mark B will look into it.

 

09/07/2013 08:55    MBRITTON

Joleen called because an ODS job failed.  It was trying to run LOAD_CSUS_APPLICANT_FRZ.  The error was caused by the fact that the CSUS_APPLICANT view got modified this week but the same change did not get made to the CSUS_APPLICANT_FRZ table.  I modified the FIRST_PROGRAM_OF_STUDY column in the freeze table to be VARCHAR2(63) instead of VARCHAR2(11).  A developer should take a look on Monday and make sure the size is right now, and fix the appropriate DDL script if applicable.  Joleen reran the AppMan job and it finished successfully.

 

 

 

Aborted Module Name:   AGENDYGN.AGENS008_01

  Date:        Day:      Time:          Resolution:

09/10/13     Tue         19:01           Restarted by Joleen.

12/03/13     Tue         19:02           Restarted by Joleen.

 

Error log and follow up comments:

 

09/10/13.

Others: PIDM : 11199075 add: ecrice@rams.colostate.edu email_type: P error is:

ORA-00001: unique constraint (GENERAL.UK_GORIROL) violated

ORA-06512: at "BANINST1.DML_GORIROL", line 48

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE", line 357

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE", line 708

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE", line 937

ORA-06512: at "BANINST1.ICSPKLDI", line 575

ORA-06512: at "GENERAL.GT_GOREMAL_AS_LDI", line 5

ORA-04088: error during execution of trigger 'GENERAL.GT_GOREMAL_AS_LDI'

 

I think the problem for this person (11199075 828739493 Rice, Emily Catherine) is that her EAPP email address (in GOREMAL) is marked as the preferred email address instead of her EID email address. Can someone change it so that the EID email address is the preferred email address? We will then restart AGENS008.

Vicki.

 

I have made the change to make the EID preferred.

Jamie.

 

12/03/13.

ORA-00001: unique constraint (GENERAL.UK_GORIROL) violated

ORA-06512: at

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE",

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE", line 708

ORA-06512: at

ORA-06512: at "BANINST1.ICSPKLDI", line

ORA-06512: at "GENERAL.GT_GOREMAL_AS_LDI", line 5

 

This person, libbyatw, is showing up twice in eids_admin_eiddata_email_01.

This is causing the problem.  Going into a meeting.  If someone figures out why and can correct the data, then we can restart the program.

Vicki.
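
The 12/03/13 abort traces back to an identifier that appears twice in the input. A quick pre-check for such repeats might look like the sketch below. The function name and the sample identifiers other than libbyatw are hypothetical, and a real check would run against eids_admin_eiddata_email_01 rather than a Python list.

```python
from collections import Counter


def repeated_keys(keys):
    """Return the keys that occur more than once, in first-seen order."""
    counts = Counter(keys)
    return [key for key in counts if counts[key] > 1]
```

For example, repeated_keys(["libbyatw", "user_a", "libbyatw"]) reports libbyatw, which is the kind of row that would need correcting before the restart.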

 

Vicki and Erin fixed the problem record. I restarted the job and it has finished running.

Joleen.

 

 

 

 

 

 

Aborted Module Name: AGENWYWP.AGENS004_01

 

  Date:        Day:      Time:          Resolution:

09/18/13     Wed       19:02           Restarted by Joleen.

12/11/13     Wed       19:02           Restarted by Joleen.

 

Error log and follow up comments:

 

09/18/13.

CSU ID       Name of Person Purged

-25705448 -Audit Keller, Amanda

-30194863 -PurgeR Budgaga, Walid Saeed H

-30216619 -Purge Eremin, Alexander

Program ended unexpectedly at CSU ID -30216619 v_pidm 11409769 at marker 14 SQLCODE/SQLERRM =  -2292, ORA-02292: integrity constraint (CSUBAN.FK1_SWRLOGD_IN Persons Purged: 3

 

ORA-02292: integrity constraint (CSUBAN.FK1_SWRLOGD_INV_SWRLOGM_KEY) violated -

ORA-06512: at line 633

 

I restarted AGENWYWP.AGENS004_01 per Peter.

AGENWYWP_PIDM_PURGE has finished running.

A new process that was added yesterday (table logging) has been turned off until this program can be fixed. This was not a data problem. Tables associated with the table logging process should not have been picked up for processing by AGENS004.

Joleen.

 

12/11/13.

CSU ID       Name of Person Purged

-30049452 -PurgeR Rankin, Charise Alyssa

-30271450 -Purge Gross, Conner Jeffrey

-30322396 -Purge Liskey, Julia Faith

-30326484 -Purge Loire, Marguax

Program ended unexpectedly at CSU ID  v_pidm 11428667 at marker 15 SQLCODE/SQLERRM =  -6502, ORA-06502: PL/SQL: numeric or value error: character to number conversion error Persons Purged: 4

 

The registrar's office fixed the SPRIDEN record associated with PIDM 11428667; the SPRIDEN_ID had a value of "-SPAIDEN", which seemed to cause the "character to number conversion error". I then had the DBAs update the CSUG_PURGE_IDS table so that its CSU ID value matched the updated CSU ID in SPRIDEN. After that, Joleen was able to run AGENS004 successfully.

Peter.

 

 

 

Aborted Module Name:  AREGSTFC_SP.AREGS400_01

  Date:        Day:      Time:          Resolution:

09/19/13     Thu       02:47            Restarted by Joleen.

 

Error log and follow up comments:

 

ORA-20501: *** TIME SLOT Out of Order for Priority 60000

ORA-06512: at line 104

 

02:47:03 103 

02:47:03 104                 raise_application_error(-20501,'*** TIME SLOT Out of Order for Priority ' ||

02:47:03 105                                         slot_row.sfrwctl_priority);

02:47:03 106             elsif slot_row.sfrwctl_begin_date = v_sfrwctl_rec.sfrwctl_end_date then

02:47:03 107                 if slot_row.sfrwctl_hour_begin < v_sfrwctl_rec.sfrwctl_hour_end then

02:47:03 108                     utl_file.put_line(file_handle, ' *** TIME SLOT Out of Order for Priority ' ||

 

We're trying to figure out the issue with AREGS400.  Vicki isn't here, so we're grasping at straws to figure out how this entire process works.  :-)

Is SFRWCTL updated manually?  It looks like there is some date overlap if you compare 201390 to 201410.  See screen shots below.  I think the date overlap for 201410 (all end dates of 5/9/2014) might be causing the problem.  Any info you have that can enlighten us is much appreciated!  :-)

Kathy.

 

Inline image 1

 

Inline image 2

 

I talked to Jamie and Sue and they corrected the dates in the tables and asked us to rerun the job.  

I restarted AREGSTFC_SP.AREGS400_01.

AREGSTFC_TIME_TICKET_CONT for Spring has finished running.

Joleen.
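
A check along the lines Kathy describes could be run against an export of SFRWCTL before restarting. This is only a sketch: the column layout (priority, begin date, begin hour, end date, end hour, with dates as YYYYMMDD and hours as HHMM), the function name check_slots, the sample rows, and the rule that a slot must not begin before the previous slot ends are all assumptions inferred from the quoted lines 104-108. It is not the actual AREGS400 logic.

```shell
#!/bin/sh
# Hypothetical check; column layout and rule are assumptions, not AREGS400 itself.
# Input: one slot per line, already sorted in processing order:
#   priority begin_date(YYYYMMDD) begin_hour(HHMM) end_date(YYYYMMDD) end_hour(HHMM)
# Exit status is 1 if any slot begins before the previous slot ends.
check_slots() {
    awk 'NR > 1 && ($2 + 0 < prev_date || ($2 + 0 == prev_date && $3 + 0 < prev_hour)) {
             print "TIME SLOT out of order for priority " $1
             bad = 1
         }
         { prev_date = $4 + 0; prev_hour = $5 + 0 }
         END { exit bad + 0 }' "$1"
}

# Made-up sample data: the second slot starts at 11:00 on the day the first ends at 12:00.
slots=$(mktemp)
cat > "$slots" <<'EOF'
50000 20131104 0800 20131104 1200
60000 20131104 1100 20131105 1700
EOF
check_slots "$slots" || echo "correct the SFRWCTL dates before rerunning"
rm -f "$slots"
```

A non-zero exit status would mean the dates still need correcting by the table owners before the job is restarted.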

 

 

Aborted Module Name:   FAIDTRAK_OD.LYNX_02

  Date:        Day:      Time:          Resolution:

09/24/13     Tue        05:41            Restarted by Joleen.

 

Error log and follow up comments:

 

 

[OracleException (0x80004005): ORA-12571: TNS:packet writer failure]

 

Can we try rerunning these first? This looks like a network hiccup.

Zach.

 

FAIDSUMR.LYNX_01 was restarted and finished successfully -- I then restarted FAIDTRAK_OD.LYNX_02 and it also finished successfully.

Joleen.

 

 

Aborted Module Name:   FAIDLORC_EV.LYNX_03

  Date:        Day:      Time:          Resolution:

09/27/13     Fri          15:21           Restarted by Joleen.

 

Error log and follow up comments:

 

 

Looking up wsnet.colostate.edu

Making HTTP connection to wsnet.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Retrying as HTTP0 request.

Looking up wsnet.colostate.edu

Making HTTP connection to wsnet.colostate.edu

Sending HTTP request.

HTTP request sent; waiting for response.

Alert!: Unexpected network read error; connection aborted.

Can't Access `http://wsnet.colostate.edu/cwis231/autorun/disb_discrepency.aspx?ay=1314'

Alert!: Unable to access document.

lynx: Can't access startfile

David.

 

 

 

Aborted Module Name:  AGENDYGN.AGENS006_01  

  Date:        Day:      Time:          Resolution:

09/27/13     Fri          19:01           Restarted by Joleen.

 

Error log and follow up comments:

 

Can anyone fix this abort? It looks like a data problem. System Support is trying to put a patch on AppMan production on Sunday. They need the schedule to be done with no aborts.

-27-SEP-2013 19:02:24                                 Colorado State University

                                  Add holds for persons with ended mailing addre

PIDM     HTYP From Date Reason                                               

---------------------------------------                                      

10116834 DX   28-SEP-13 Address update needed                                

11304957 DX   28-SEP-13 Address update needed                                

11305301 DX   28-SEP-13 Address update needed                                

11319346 DX   28-SEP-13 Address update needed                                

Error: ORA-20100: ::Hold from date must be less than or equal to hold to date::

 

You'll probably need to call someone about this in order to get a response.  I agree with your earlier assessment that it is a data problem, which means a developer or user needs to look at it.  If data needs to be updated, you can call the DBA phone, but someone else will need to determine what that update should be before you make that call.  I'm not sure who the appropriate person to call would be; maybe Vicki or Phil?

Mark. B.

I was able to contact Vicki and she instructed me to skip AGENS006 and let the rest of AGENDYGN run. We will need to resolve the AGENS006 error on Monday.

David.

 

11407304             830198573           Reis, Daniel Cristian

11407304             MA         2              27-SEP-2013       31-DEC-2099       2925 Tumbleweed Lane

  for address in ended_address_without_hold_c loop

    put_report_line(rpad(address.spraddr_pidm, 9) || 'DX   ' ||

                    rpad(address.spraddr_to_date + 1, 10) ||

                    'Address update needed');

    gb_hold.p_create(p_pidm        => address.spraddr_pidm,

                     p_hldd_code   => 'DX',

                     p_user        => user,

                     p_from_date   => address.spraddr_to_date + 1,

                     p_to_date     => '31-DEC-2099',

                     p_release_ind => 'N',

                     p_reason      => 'Address update needed',

                     p_data_origin => PROGRAM_NAME,

                     p_rowid_out   => rowid_out);

I believe the error is caused by this person’s mailing address ending on 31-DEC-2099.

AGENS006 adds a DX Hold with a ‘from date’ that is one day after the address ‘to date’, which puts the ‘from date’ after the hold ‘to date’ of ’31-DEC-2099’.

Please fix the ‘to date’ on this Mailing Address to something much closer in time to today’s date (whatever is correct).

Vicki.

 

I removed the ‘to date’ for this student’s mailing address. It appears someone updated the student’s mailing address on their behalf and mistakenly entered a ‘to date’. I’ll reach out to that person.

Jamie.
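
The condition Vicki describes can be tested before the job runs. Below is a minimal sketch, assuming address ‘to dates’ have been exported as YYYYMMDD integers; the function name hold_would_fail and the export format are hypothetical and not part of AGENS006. Because the hold ‘from date’ is the address ‘to date’ plus one day, the hold is rejected whenever the address ‘to date’ is on or after the hold ‘to date’.

```shell
#!/bin/sh
# Hypothetical pre-check, not part of AGENS006.
# Dates are YYYYMMDD integers, so numeric comparison orders them correctly.
HOLD_TO_DATE=20991231   # mirrors p_to_date => '31-DEC-2099'

# Returns 0 (true) when a DX hold created from this address would be rejected:
# from_date = addr_to_date + 1 day, which exceeds HOLD_TO_DATE exactly when
# addr_to_date >= HOLD_TO_DATE.
hold_would_fail() {
    addr_to_date=$1
    [ "$addr_to_date" -ge "$HOLD_TO_DATE" ]
}

if hold_would_fail 20991231; then
    echo "address to-date 20991231: hold would be rejected"
fi
if ! hold_would_fail 20130927; then
    echo "address to-date 20130927: hold can be created"
fi
```

Any address flagged this way would still need a user to decide what the correct ‘to date’ is.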

 

 

Aborted Module Name:   APMXRMIE.APMXRMIE_01

  Date:        Day:      Time:          Resolution:

09/30/13     Mon        07:31           Restarted by David.

 

Error log and follow up comments:

 

cat: 0652-050 Cannot open apmxrmie_files_processed.log.

Mon Sep 30 07:30:10 MDT 2013

Processing file RmiServer1309290835.log

Found ErrorMsg

Mon Sep 30 07:30:14 MDT 2013

Processing file RmiServer1309290840.log

Found ErrorMsg

Found ErrorMsg

Mon Sep 30 07:30:18 MDT 2013

Processing file RmiServer1309290855.log

Found ErrorMsg

Mon Sep 30 07:30:21 MDT 2013

 

BANNERLOG=/appworx/out/APMXRMIE.APMXRMIE_01.11562923.11562924.00..log

+ [[ -s /appworx/out/APMXRMIE.APMXRMIE_01.11562923.11562924.00..log ]]

+ print *** \n*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:\n***

+ 1> /ais01/dat/work/prod/APMXRMIE.APMXRMIE_01_jobstat

+ egrep -v -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog_exceptions

+ 1>> /ais01/dat/work/prod/APMXRMIE.APMXRMIE_01_jobstat

+ egrep -i -f /ais01/dat/misc/prod/errstrg_appworx_joblog /appworx/out/APMXRMIE.APMXRMIE_01.11562923.11562924.00.2013_09_30_0730.AWPROD.LOG

+ print *** \n*** END SEARCH OF JOBLOG FOR ERROR STRINGS \n***

+ 1>> /ais01/dat/work/prod/APMXRMIE.APMXRMIE_01_jobstat

+ cat /ais01/dat/work/prod/APMXRMIE.APMXRMIE_01_jobstat

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cat: 0652-050 Cannot open apmxrmie_files_processed.log.

 

I checked /appworx/log and found that apmxrmie_files_processed.log existed, so I re-started and it completed successfully.

David.

 

 

Aborted Module Name:   AROSFRQ1.TGRAPPL_01

  Date:        Day:      Time:          Resolution:

10/25/13     Fri          15:00           See note below.

 

 

Error log and follow up comments:

 

 

FYI.

There is a new process flow, AROSRSET_TGRAPPL_RESET_FLAG. This PF is used to reset the flag in the GJBPRUN table on the rare occasions where the AROSFRQ1.TGRAPPL_01 fails and cannot be re-started until the flag in the GJBPRUN table has been reset.

AROSS157.sql was written by Rob to accomplish this, rather than having a DBA do it.

David.

 

 

 

Aborted Module Name: AREGTTRN.SSH_SFTP_RC_03 

  Date:        Day:      Time:          Resolution:

10/21/13     Mon          00:22           See note below.

 

Error log and follow up comments:

 

 

There are files waiting to be processed in this directory:

/ais01/ftp/from/user/AREGTTRN.colora-88-20131021053009.txt

 

# > ssh: connect to host iwantmytranscript.com port 22: Connection timed out

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

 

I was just able to manually connect, so you should be able to restart following any restart instructions you have for this job.

Elden.

 

I logged on to the transcripts web site and I could see the file in question still there in the 'awaiting-process' directory. I re-started the failed job and it picked up the file.

The Transcript process is now continuing.

David.
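
Several aborts in this log are transient connection failures that succeed on a later manual restart. A bounded retry around the transfer step is one way to absorb them. The sketch below is an assumption, not the site's SSH_SFTP script: the function name retry, the attempt count, and the delay are all hypothetical, and a stand-in command is used instead of a real sftp call. Whether an automatic retry is appropriate for a given job is a separate decision, since some jobs have their own restart instructions.

```shell
#!/bin/sh
# Hypothetical retry wrapper; not the site's SSH_SFTP script.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Demo with a stand-in command that fails twice and then succeeds.
counter_file=$(mktemp)
echo 0 > "$counter_file"
flaky() {
    n=$(cat "$counter_file")
    n=$((n + 1))
    echo "$n" > "$counter_file"
    [ "$n" -ge 3 ]
}

retry 5 0 flaky && echo "succeeded after $(cat "$counter_file") attempts"
rm -f "$counter_file"
```

In real use the delay would be minutes rather than zero, and the final failure should still abort the module so operators are alerted.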

 

 

 

 

Aborted Module Name:   FAIDALCT.SSH_SFTP_RN_01

  Date:        Day:      Time:          Resolution:

11/07/13     Thu         07:04          See note below.

 

Error log and follow up comments:

 

 

FAIDALCT.SSH_SFTP_RN_01 / SFTP_FILSEND is in ABORT status on AWPROD.

 

# > Permission denied (password,gssapi-with-mic).

 

I restarted this and it finished…

Steve G.

 

 

 

 

Aborted Module Name:  FAIDRESD.FTP-SFS-LOOP_01

  Date:        Day:      Time:          Resolution:

11/07/13     Thu         21:16          Restarted by David.

 

Error log and follow up comments:

 

+ grep 11846531

+ egrep ABORTED|CRITFAIL|C-Error

     11846531.00 BATCH     FAIDRESD.FTP_AIS01_S11/07 21:19 00:03:10 ABORTED                FAIDRESD_RESIDENCY_LETTERS

+ print Failure in spawned FTP_AIS01_SFS - abort this module

Failure in spawned FTP_AIS01_SFS - abort this module

+ exit 1

 

# put    + : Opening ASCII mode data connection for sfs\FAIDRESD.GLRLETR_01.11843792.RDIE.doc.

#******************************************************************************

# FATAL : Unexpected error - undefined line returned

#------------------------------------------------------------------------------

# USAGE: /appworx/csu/exec/FTP_ENHANCED.PL \

#     remote_host=override_host_name\

#     transfer_mode=transfer_command\

#     translate=translation_mode\

#     src_file=fully_qualified_source_file\

#     dst_file=fully_qualified_destination_file\

#     site_options=comma_delimited_site_options\

#     local_options=semicolon_delimited_local_options\

#   transfer_mode values

#     append, dir, get, put, recv, send, submit

#   translate values:

#     ascii, binary, ebcdic

#   site_options values

#     comma delimited site options

#       RECFM=FBA,LRECL=133,BLKSIZE=3325

#   local_options values

#     semicolon delimited local options

#       active | passive | cd=remote_dir

#   Also, these environment variable must be set

#     net_connect, full_login, db_password

#------------------------------------------------------------------------------

# exit     : [ 2013.11.07-21:19:46 ] -- RETURN CODE = 100

 

I re-started and it completed.

David.

 

 

Aborted Module Name:   CLMSDATA.SSH_EXEC_01

  Date:        Day:      Time:          Resolution:

11/20/13     Wed       05:56          Restarted by Joleen.

 

Error log and follow up comments:

 

# 2013.11.20-06:06:45 : > Error: 2013-11-20 06:06:45.43

# 2013.11.20-06:06:45 : >    Code: 0xC002F304

# 2013.11.20-06:06:45 : >    Source: email BannerUpdateResults Send Mail Task

# 2013.11.20-06:06:45 : >    Description: An error occurred with the following error message: "Mailbox unavailable. The server response was: 5.1.1 <valerie.monahan@colostate.edu>... User unknown".

# 2013.11.20-06:06:45 : > End Error

 

Good Morning Chrystal and Steven,

Do you have any information on Valerie Monahan? The CLMS job is trying to send her an email and it is failing.

Joleen.

 

Sorry I didn’t think to mention it, but Valerie is no longer with the University.  You can remove her email address from any other lists she might have been on.  We are still waiting to hear if the position will be refilled, or if there will be some restructuring, so there isn’t an address that should be substituted at this time. 

Chrystal.

 

Here are some notes we have from when this abort happened before. The email address will need to be fixed over in your office.

Here is the command that is being remotely executed once the ssh connection has been established. We can’t make any changes to those files.

cmd.exe /V:ON /C

"D:\Program Files\SCT\clm\DtsxPackages\dtexec_wrapper.cmd"

"file=D:\Program Files\SCT\clm\DtsxPackages\BannerUpdateDriver.dtsx"

"config=C:\Users\sshuser\.ssh2\BannerUpdateDriver.cfg"

 

Comment from Steven Dove from the prior abort:

“I’ve updated the email list in the DTSX package and copied to the CLM server.”

Joleen.

 

I’ve updated the DTSX package to use email group bfs_clm_ftp@mail.colostate.edu so there are no more hard coded recipients.  I’ve copied the updated DTSX package to the CLM server.

Steven Dove.

 

 

 

 

 

Aborted Module Name:  CLMSDATA.SSH_EXEC_01

  Date:        Day:      Time:          Resolution:

11/20/13     Wed       11:20           Restarted by Joleen.

 

Error log and follow up comments:

 

# 2013.11.20-10:21:46 : >    Description: Package migration from version 6 to version 2 failed with error 0xC001700A "The version number in the package is not valid. The version number cannot be greater than current version number.".

# 2013.11.20-10:21:46 : > End Error

# 2013.11.20-10:21:46 : > Error: 2013-11-20 10:21:46.06

# 2013.11.20-10:21:46 : >    Code: 0xC0010018

# 2013.11.20-10:21:46 : >    Source: 

# 2013.11.20-10:21:46 : >    Description: Error loading value "<DTS:Property xmlns:DTS="www.microsoft.com/SqlServer/Dts" DTS:Name="PackageFormatVersion">6</DTS:Property>" from node "DTS:Property".

I’ve been able to run the job locally so if this is holding up the schedule we can skip this job today.  I reached out to Lance Baatz this morning since he helped set up the execution scripts to see if he has thoughts on why the updated job fails the execution command.

~Steven.

We run the command under the sshuser account.  Here's the general environment information from a recent connection:

C:\Users\sshuser>set

ALLUSERSPROFILE=C:\ProgramData

APPDATA=C:\Users\sshuser\AppData\Roaming

CommonProgramFiles=C:\Program Files\Common Files

COMPUTERNAME=CLM1

ComSpec=C:\Windows\system32\cmd.exe

FP_NO_HOST_CHECK=NO

lib=C:\Program Files\SQLXML 4.0\bin\

LOCALAPPDATA=C:\Users\sshuser\AppData\Local

NUMBER_OF_PROCESSORS=8

OS=Windows_NT

Path=d:\oracle\product\11.1.0\client_1\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;c:\util;C:\Program Files\Microsoft SQL Server\80\Tools\Binn\;C:\Program Files\Microsoft SQL Server\90\DTS\Binn\;C:\Program Files\Microsoft SQL Server\90\Tools\binn\;C:\Program Files\Microsoft SQL Server\90\Tools\Binn\VSShell\Common7\IDE\;C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\PrivateAssemblies\;C:\Program Files\SSH Communications Security\SSH Tectia\SSH Tectia AUX;C:\Program Files\SSH Communications Security\SSH Tectia\SSH Tectia AUX\Support binaries;C:\Program Files\System Center Operations Manager 2007\;C:\Windows\System32\WindowsPowerShell\v1.0\

PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC

PROCESSOR_ARCHITECTURE=x86

PROCESSOR_IDENTIFIER=x86 Family 6 Model 23 Stepping 10, GenuineIntel

PROCESSOR_LEVEL=6

PROCESSOR_REVISION=170a

ProgramData=C:\ProgramData

ProgramFiles=C:\Program Files

PROMPT=$P$G

PSModulePath=C:\Windows\system32\WindowsPowerShell\v1.0\Modules\

PUBLIC=C:\Users\Public

SystemDrive=C:

SystemRoot=C:\Windows

TEMP=C:\Users\sshuser\AppData\Local\Temp

TMP=C:\Users\sshuser\AppData\Local\Temp

USERDOMAIN=CLM1

USERNAME=sshuser

USERPROFILE=C:\Users\sshuser

windir=C:\Windows

TERM=xterm

SSH_SESSION_ID=26

Here's the actual command we run to execute the package:

dtexec  -FILE "D:\Program Files\SCT\clm\DtsxPackages\BannerUpdateDriver.dtsx" -DECRYPT "xxxxxxxx"

Is this the same command you used to test the package?

Since the error is indicating a version issue, do we need to have different libraries, paths, etc?  If we do, then it seems that any other packages we execute will have problems with the new libraries unless all the packages are compiled in the newer environment.

I wonder if there is a compatibility option when compiling that will allow you to modify the version number.

Elden.

I ran the package straight from Visual Studio 2010 and not from a command line.

I’ve done some reading and it looks like VS2010 will not make a DTSX package that is usable by an older version of SQL Server.  It looks like CLM is still on SQL Server 2005, so we’re looking for an old machine on which to install VS2005 to fix the current problem.

Lance Baatz was able to make the changes we needed using BIDS on the CLM server.  I just saw that the updates ran from our schedule (see attachment) so I think we are good now.

~Steven.

 

 

 

Aborted Module Name:   APMXRMNT.APMXS001_01

  Date:        Day:      Time:          Resolution:

11/22/13     Fri          12:55           Restarted by Robin.

 

Error log and follow up comments:

 

 

FROM AWMAINT.CSU_APPMAN_JOB_HIST A

              *

ERROR at line 72:

ORA-06550: line 72, column 15:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 42, column 2:

PL/SQL: SQL Statement ignored

ORA-06550: line 41, column 9:

PLS-00341: declaration of cursor 'R2_CUR' is incomplete or malformed

ORA-06550: line 76, column 2:

PL/SQL: Item ignored

ORA-06550: line 82, column 47:

PL/SQL: ORA-00942: table or view does not exist

 

I changed the login for APMXS001 in backlog to awmaint (it was appworx) and it is now processing.

 

Steve has updated the login for APMXRMNT.APMXS001_01 to awmaint. (It was not set to any login and therefore defaulted to appworx.)

 

FYI, The APMXRMNT.APMXS001_01 has been running for 1 hour 45 mins. I noticed that we normally don't run this, but APMXS001 must have been added since last week? Since the SQL has commits in it I will just let it continue to run for now.

David.

 

Usually I let the job run separately. I need to get with the DBAs about slow inserts and updates into table CSU_APPMAN_CHAIN_FACT. The issue does not exist with the CSU_APPMAN_JOB_FACT table, even though it is larger in size.  I will separate the job and chain inserts for now. Only job data is needed for those reports.

Gudrun.

 

 

 

 

Aborted Module Name: AREGDYGN.AGENS017_01   

  Date:        Day:      Time:          Resolution:

12/03/13     Tue        20:32           Restarted by Joleen.

 

Error log and follow up comments:

 

From: jobprd@mailer.is.colostate.edu [jobprd@kebler.is.colostate.edu]

Sent: Tuesday, December 03, 2013 8:36 PM

To: IS DL: Alert APMX

Cc: 9705815577@tmomail.net

Subject: AWPROD APMXCHKS Abort Job Backlog Warning

 

Tue Dec 03 20:35:20 MST 2013                                                                                                    Page 1

                                            Check Backlog for ABORTED jobs (so_status  202)

Job                  Chain Id Start Date              Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

-------------------- -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

AREGDYGN.AGENS017_01 12019935 12-03-2013 20:32:14 MST    202 ABORTED              544.67                    182                     33

 

The file had a _ instead of a .

I renamed the file and restarted AREGDYGN.AGENS017_01.

It has finished running.

Joleen.

 

 

 

Aborted Module Name: AGENDYGN.AGENS008_01 

  Date:        Day:      Time:          Resolution:

12/31/13     Tue        19:01           Restarted by Joleen.

01/07/14     Tue        19:01           Restarted by Joleen.

 

Error log and follow up comments:

12/31/13.

Others: PIDM : 10640237 add: katie@jmcollins.org email_type: P error is:

 

ORA-00001: unique constraint (GENERAL.UK_GORIROL) violated

ORA-06512: at

"BANINST1.DML_GORIROL", line 48

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE", line 357

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE", line 708

 

AGENS008 did complete successfully. There were two data elements that were changed and I can't say if both were required for successful processing but here’s what was done:

·         In GOREMAL, Jamie did change the PREFERRED_IND field so that the record with the “EID” EMAL_CODE was set to be the preferred email - previously the record with the “EAPP” EMAL_CODE was the preferred email

·         In the EIDS_EMAIL_00 table, Randy did update her EMAIL_ADDRESS since it had what is considered an incorrect email address for the eid. It was katie@jmcollins.org but should have been her rams email address, and Randy changed it to: kmc318@rams.colostate.edu.

Peter.

 

01/07/14.

ORA-00001: unique constraint (GENERAL.UK_GORIROL) violated

ORA-06512: at

"BANINST1.DML_GORIROL", line 48

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE", line 357

ORA-06512: at "BANINST1.GB_INSTITUTION_ROLE", line 708

 

It appears to be the same problem for 11354524               829827335           Clements, Mercedes Alivia

Her EAPP is the preferred email address.

Jamie can you please make the EID email address the preferred email address for Mercedes?

Then Joleen you can try to rerun the program and we will find out if Randy will or will not have to do his piece.

 

PIDM      EMAL_CODE  Email Address                Status  Preferred Ind

11354524  EAPP       Merc8des@hotmail.com         A       Y

11354524  EID        merc8des@rams.colostate.edu  A       N

11354524  P1         cedesalexis@hotmail.com      A       N

 

Vickie.

 

I have changed the indicator to preferred on the EID email address and removed it from the EAPP email address.

Jamie.

 

 

 

Aborted Module Name:   HRMSS229.SSH_SFTP_01

 

  Date:        Day:      Time:          Resolution:

01/03/14     Fri          20:43           Restarted by Steve.

 

Error log and follow up comments:

 

# > Permission denied (password,publickey,keyboard-interactive).

 

I manually tried connecting to the server tonight and it failed.

Can someone check with DeltaDentalCO to see if they know of any issues?

When we try to connect, it tries to offer the key for about a minute, then fails.

Elden.

 

HRMSS229.SSH_SFTP_01 is still sitting with ABORTED status in AWPROD backlog.

Has anyone checked on this as Elden suggested? Who has contact information for DeltaDentalCO?

Gudrun.

 

Jennifer said she was working on it.

Vickie S.

 

At the moment the only thing that prevents last Friday's HRMS schedule from completing is HRMSS229_DELTA_DENTAL.

Unless you foresee the issue being resolved prior to 3pm, would we have permission, for today only, to complete the HRMS schedule by removing its dependency on the completion of process flow HRMS229?

I haven’t heard anything about HRMSS229, unless you two have. Communication on this has been rather sparse.

Can you and Dawn check on this?   We need to complete the HRMS schedule unless it should be postponed, which I don’t think is necessary; Vicki and/or Jennifer should be OK with that. If not them, could one of the HR developers provide feedback?

Gudrun.

 

I spoke to Bob V.  He said the schedule can proceed without it.  Steve plans to remove that dependency soon.  He is just waiting until the time for staging is closer to give them a chance to resolve the abort.

Dawn.

 

I will take care of getting HRMSAM99_HRMS_SCHEDULE_DONE from Friday’s schedule to complete, so that tonight’s HRMS schedule will not be delayed.  Even if HRMSS229 isn’t fixed today, it shouldn’t affect tonight’s schedule.

Steve. G.

 

Do you know if anybody has tried to send the file to Delta today?  If not and you need to remove the dependency to complete the process today, Jennifer said that that would be fine.

Vivian.

 

I tried restarting the failed component this morning and it failed again with the same connection error.  I can try it again, and if it fails again I’ll just force HRMSAM99_HRMS_SCHEDULE_DONE to complete. 

Well, I restarted HRMSS229.SSH_SFTP_01 and it finished this time!  HRMSAM99_HRMS_SCHEDULE_DONE also finished, so no other intervention was required.  Can we verify that the file was received by Delta?

Steve. G.

 

It seems Greg may have gotten his backups sorted out too. There may be one thing left to check, if needed; David might know more about that one.

What about last Friday’s ADMSBDMS_F3_SLATE_BDMS_FTP? Is it OK to have that one running while a new ADMS schedule starts?

Gudrun.

 

Joe Volesky had to restart the process on the BDMS server this morning. Since then ADMSBDMS_F3 has been processing the remaining 2,000+ files. It is currently down to about 300 left and should be done by about 4:15pm. No problems with tonight’s ADMS processing.

David.

 

I just received verification from Delta Dental that they did receive the file and are in the process of uploading it. 

They also verified that their systems were down all weekend and were not restored until late this morning.

Jennifer.

 

 

 

Aborted Module Name:   AREGDYCR.AREGS304_01

  Date:        Day:      Time:          Resolution:

01/10/14     Fri          08:51           Restarted by Joleen.

 

Error log and follow up comments:

 

AREGDYCR.AREGS304_01 has ABORTED. Please see the attached output file. There are 2 utl files located at /orautl/BANPROD that might be helpful?

 

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 714

ERROR at line 1:

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 805

 

08:51:47 713        BEGIN

08:51:47 714          SELECT 'First Generation'

08:51:47 715            INTO v_first_gen

08:51:47 716            FROM swrgpcd g

08:51:47 717           WHERE g.swrgpcd_gpcd_code = 'FIRSTGEN'

08:51:47 718             AND g.swrgpcd_pidm = main_rec.pidm;

08:51:47 719        EXCEPTION

08:51:47 720          WHEN NO_DATA_FOUND

08:51:47 721          THEN

08:51:47 722            v_first_gen   := 'Not First Generation';

08:51:47 723        END;

 

08:51:47 798  EXCEPTION

08:51:47 799    WHEN OTHERS

08:51:47 800    THEN

08:51:47 801      DBMS_OUTPUT.put_line (

08:51:47 802        'Program ended unexpectedly for CSU ID: ' || v_csuid || ', PIDM: ' || v_pidm);

08:51:47 803      DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);

08:51:47 804      DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_backtrace);

08:51:47 805      RAISE;

08:51:47 806  END;

 

Kathy found two students in the system with two FIRSTGEN data codes. We had Lorelei fix the records for us. I restarted and AREGDYCR_CONFLICT_RESOLUTION has finished running.

Joleen.
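
The quoted block at lines 713-723 handles only NO_DATA_FOUND, so a student with two FIRSTGEN rows makes the SELECT ... INTO raise ORA-01422. Duplicates like this can be found ahead of time from a plain listing. The sketch below assumes a hypothetical export with one "PIDM CODE" pair per line; the function name find_duplicates and the sample PIDMs are made up for illustration and nothing here is part of AREGS304.

```shell
#!/bin/sh
# Hypothetical duplicate check on an exported "PIDM CODE" listing; not part of AREGS304.
# Prints each PIDM/code pair that appears more than once, followed by its count.
find_duplicates() {
    awk '{ count[$1 " " $2]++ }
         END { for (k in count) if (count[k] > 1) print k, count[k] }' "$1" | sort
}

# Made-up sample listing: PIDM 1001 carries the FIRSTGEN code twice.
listing=$(mktemp)
cat > "$listing" <<'EOF'
1001 FIRSTGEN
1002 FIRSTGEN
1001 FIRSTGEN
1003 HONORS
EOF
find_duplicates "$listing"
rm -f "$listing"
```

The same kind of check would apply to the duplicate-record aborts elsewhere in this log, with the key columns changed to suit the table involved.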

 

 

 

 

Aborted Module Name:   AREGFQTR.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

01/12/14     Sun         22:24           Restarted by Joleen.

 

Error log and follow up comments:

 

 

I restarted AREGFQTR.SSH_SFTP_01 and it finished running. The job had timed out and the driver was empty.

 

# > ssh: connect to host iwantmytranscript.com port 22: Connection timed out

# > Connection closed

# > (255)

 

 


From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]
Sent: Sunday, January 12, 2014 10:36 PM
To: IS DL: Alert APMX
Cc: 9705815577@tmomail.net
Subject: AWPROD APMXCHKS Abort Job Backlog Warning

 

Sun Jan 12 22:35:20 MST 2014                                                                                                    Page 1

                                            Check Backlog for ABORTED jobs (so_status  202)                                          

Job                  Chain Id Start Date              Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

-------------------- -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

AREGFQTR.SSH_SFTP_01 12282431 01-12-2014 22:24:09 MST    202 ABORTED             6119.49                    665                     11

 

 

 

Aborted Module Name: KFSXFPAA.KFSX_JAVA_01 

                                                                                                                     

  Date:        Day:      Time:          Resolution:

01/13/14     Mon       19:26           Resubmitted by Dermot.

 

 

Error log and follow up comments:

 

 

KFSXFPAA.KFSX_JAVA_01 / “procurementCardAutoApproveDocumentsStep” / The above job failed last night. It canceled itself and did not hold up the nightly schedule. Below is the Appman error received.

 

2014-01-13 19:26:08,681 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.kuali.rice.kew.exception.WorkflowServiceErrorException: Document Search Validation Errors []

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

Dermot.

 

Looks like it was trying to do a doc search of procurement cards with a status of ‘R’.  

The lack of a validation error (in the brackets) makes it pretty difficult to determine what was going on. 

Mike.

 

Please run the FPAA again in production.

There are 71 Pcard documents that need to get processed.

Josh.

 

After speaking with Mike, it was decided that if this error occurs again it is okay to resubmit the job.

The night this job ABORTED we had some DB ERRORS on APPMAN, so the abort may be related to some kind of system blip; there is no other explanation. The job was resubmitted the next morning and completed successfully!

Dermot.

 

 

 

 

Aborted Module Name: ADMSSRLD_DY.SURLOAD_05 

  Date:        Day:      Time:          Resolution:

01/16/14     Thu         22:51          Restarted by Joleen.

 

Error log and follow up comments:

 

ORA-02291: integrity constraint (GENERAL.FK1_GLBEXTR_INV_GLBSLCT_KEY) violated - parent key not found

WRN-ORACERR: Error occurred in file "surload.pc" at line 705

WRN-ERRSTMT: Following statement was last statement parsed:

    insert into GLBEXTR (glbextr_key,glbextr_application,glbextr_selection

surload terminated with error

11 lines written to /appworx/out/surload_3304661.lis

                                                       AUTOMATED GURMAIL LOAD

                                                            Load Errors

 

PIDM      Id                            Name                     System Indicator     Description/Comments

011218524 828875678 Flinkstrom , Rachel L                        S                    Not Loaded; Duplicate Non-printed letter

 

Kathy contacted Trish and asked her to remove the duplicate record and asked me to restart the job.

ADMSSRLD_DY.SURLOAD_05 has finished running.

Joleen.

 

 

 

 

Aborted Module Name: FAIDDYNT_OD.GLBLSEL-GLRLETR_01

  Date:        Day:      Time:          Resolution:

01/16/14     Thu         22:51           Restarted by Joleen.

 

Error log and follow up comments:

 

 

It looks like I don't have a driver for FAIDDYNT_OD.GLBLSEL-GLRLETR_01. GLBLSEL-GLRLETR is a component we added to FAIDDYNT in April. I have a driver for the EV schedule which contains: R_DYNT_RVIME. Is it OK to create the OD schedule driver using the same R_DYNT_RVIME value?

 

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

/appworx/csu/exec/GLBLSEL-GLRLETR.KSH[246]: /userfiles/Ufaid/data/FAIDDYNT_OD.GLBLSEL-GLRLETR_01.DAT: cannot open

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

Joleen.

 

FAIDDYNT_OD is complete, but it looks like it did NOT select any letters.

 

I believe I have this bug fixed, so FTP-SFS-LOOP.KSH should be able to handle the case where a driver file doesn't exist.

David.
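
Note: a minimal sketch of the "driver file doesn't exist" handling described above, written in Python rather than KSH. The function name `read_driver_values` is hypothetical, and treating a missing driver as "select nothing" is an assumption, not a description of what the fixed script actually does.

```python
from pathlib import Path


def read_driver_values(driver_path):
    """Return the non-blank lines of a driver file, or [] if the file is missing."""
    path = Path(driver_path)
    if not path.is_file():
        # Missing driver: report it and return nothing instead of aborting.
        print(f"driver file not found, nothing to select: {path}")
        return []
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]
```

With this shape a component without a driver for a given schedule would finish with an empty selection instead of a "cannot open" abort.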

 

 

 

 

 

Aborted Module Name:  FAIDCFEX_SP.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

01/23/14     Thu         07:26          Restarted by Joleen.

04/17/14     Thu         09:37          Restarted by David.

 

Error log and follow up comments:

 

01/23/14.

 

We have tried restarting once and it aborted again.

 

# > ssh: connect to host ftp.college-assist.org port 22: Connection refused

 

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# > Connection closed

# RETURN CODE = 100

 

The state is having server issues.  I will let you know when they are fixed and you can send the file again.

Vanessa.

 

04/17/14.

 

# FATAL : Command failed with code : 255

 

We are not able to connect to cofcsu@ftp.college-assist.org. Is their server up?

David.

 

I spoke with Peggy Hill @ COF.  She is checking their server & will get back with me.

Vanessa.

 

I was able to connect manually, so I restarted FAIDCFEX_SP.SSH_SFTP_01 and it completed successfully.

David.
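
Note: a minimal sketch of a reachability check that could be run before restarting an SFTP step after a "Connection refused". It only tests whether the TCP port accepts a connection; it does not attempt SSH authentication. The function name `port_is_open` and the default timeout are assumptions.

```python
import socket


def port_is_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures.
        return False
```

For the host in this entry the call would be `port_is_open("ftp.college-assist.org")`. A `False` result suggests waiting on the remote side rather than restarting the job again.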

 

 

 

 

 

Aborted Module Name:   ODSRAGEN.ODSRS002_01

  Date:        Day:      Time:          Resolution:

01/24/14     Fri         02:57           Deleted by Joleen.

01/28/14     Tue        02:30          Deleted by Joleen.

 

Error log and follow up comments:

 

01/24/14.

ERROR at line 1:

ORA-12008: error in materialized view refresh path

ORA-01555: snapshot too old: rollback segment number 27 with name "_SYSSMU27_1294187699$" too small

ORA-02063: preceding line from BANPROD@ODS_USER

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2563

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2776

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2745

ORA-06512: at line 87

 

It was CSUBAN.CSUS_TEST_SCORES_MV again.

I think we should abort the process and let the schedule finish…any objections?

Mark. P.

 

Should we run the next step which sends this notify to the IR server?  - “odsrefresh_odbaodsp_control.dat”

Joleen.

 

Yes…………………………Mark. P.

 

I deleted ODSRAGEN.ODSRS002_01 and let the rest of the job finish out, which included the notify to IR.

Joleen.

 

Erin took over this view and I don't think she is here today.  I think it's ready to go - just waiting for end users to give the okay.  

Kathy B.

 

01/28/14.    

ORA-12008: error in materialized view refresh path

ORA-01555: snapshot too old: rollback segment number 17 with name "_SYSSMU17_1294186495$" too small

ORA-02063: preceding line from BANPROD@ODS_USER

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2563

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2776

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2745

ORA-06512: at line 87

 

The csuban.csus_test_scores_mv was the offender again.

Please abort the job and allow the schedule to finish.

Vicki...any chance we can get the new version of this in ODS Prod today?

Mark. P.

 

 

 

 

Aborted Module Name:   AREGDRGC_SP.WAIT_FOR_DARS_01

  Date:        Day:      Time:          Resolution:

01/24/14     Fri          00:08          No output File - Deleted per Vickie.

 

Error log and follow up comments:

Object Code error : file 'bldreqt'                                                                                           

error code: 153, pc=0, call=1, seg=0                                                                                      

153     Subscript out of range (in bldreqt.cob, line 2623)

Processing ended with this student:

PREPARED: 01/24/14 - 00:08                          828868685   

Dicesare,Carly Lynn                                             

PROGRAM CODE: BCHM-HMSZ-BS               CATALOG YEAR: 201090            

                       BIOCHEMISTRY MAJOR            HEALTH AND MEDICAL SCIENCES CONCENTRATION

Can you confirm that the DARS (Darwin) DAEMON is running?

Vicki.

Yes, the daemon is running.  The error listed is from the Cobol program so it obviously got started.   You might run that error by Mike Taylor and see if he has any insight.

Mark. B.

The status of the batch is ‘E’, which indicates an error of some kind.

David.

Yes, it looks like an audit, or some of the audits in the batch, may have errored, rather than there being anything wrong with the daemon itself. Mark, if you could help me get access to the production DARS logs, I can look into them further. However, the error snippet that Joleen included does show an error at bldreqt.cob, which I believe is likely part of the DARS STUINST module. I’ve copied Mike Taylor on this email to bring him into the loop.

Mike – do you think you could help troubleshoot if there’s an error resulting from the STUINST?........................ZACH.

I opened up the permissions on the log directory so hopefully you can see them now.   They are on Kebler in /app/dars/darsprod/dars35/bin/temp.  Let me know if you can’t get to them and I’ll just copy them off somewhere.

Mark. B.

It appears that a subscript is out of range in program bldreqt.cob.

Object Code error : file 'bldreqt' error code: 153, pc=0, call=1, seg=0

153     Subscript out of range (in bldreqt.cob, line 2623)……………………..Mike. T.

I asked Jamie Yarbrough to see if she could figure anything out on their side. I’m hoping maybe it’s some bad data that we can get fixed. As for what’s going on in the COBOL of DARS, I don’t really have any way to dig deeper…………………..Zach.

On Empire, I found these for bldreqt.cob:

/app/dars/dars8/dars35/source/bldreqt.cob

/app/dars/dars8/dars35/source_354/bldreqt.cob

/app/dars/dars8/dars35/source_355/bldreqt.cob

/app/dars/darsdevl/dars35/source/bldreqt.cob

/app/dars/darsdevl/dars3581/source/bldreqt.cob

/app/dars/darstest/dars35/source_354/bldreqt.cob

/app/dars/darstest/dars35/source_355/bldreqt.cob

/app/dars/darstest/dars35/source_356/bldreqt.cob

/app/dars/darstest/dars3581/source/bldreqt.cob

I don't know if any of these are current for production.

Elden.

 

 

 

Aborted Module Name: KFSXFPPC.KFSX_JAVA_04 

  Date:        Day:      Time:          Resolution:

02/10/14     Mon        19:17          Restarted by Dermot.

05/27/14     Tue          19:26          !!!!!!!!!!!!!!!!!!!!!!!.

Error log and follow up comments:

 

02/10/14.

Started processing step procurementCardRouteDocumentsStep of job KFSXFPPC.procurementCardRouteDocumentsStep.12487441.12487452.00 for user kr

               Executing step: procurementCardRouteDocumentsStep

               #### Log file name for this job step : /ais02/app/kfs/prd/logs/KFSXFPPC.procurementCardRouteDocumentsStep.12487441.12487452.00-20140210-19-17-26-497.log

               *******************************************************

2014-02-10 19:17:26,572 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.kuali.rice.kew.exception.WorkflowServiceErrorException: Document Search Validation Errors [[WorkflowServiceErrorImpl: type=error.custom, message=MISCELLANEOUS has an invalid accounting percentage. Accounting lines cannot be 0%., arg1=MISCELLANEOUS has an invalid accounting percentage. Accounting lines cannot be 0%., arg2=null, children=[]]]

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

 

As per Josh, if this error ever happens again (see below), wait 10 minutes before restarting the step. This error happened on the first nightly run after the upgrade to 5.0.3 and is not expected to recur.

 

MISCELLANEOUS has an invalid accounting percentage. Accounting lines cannot be 0%., arg1=MISCELLANEOUS has an invalid accounting percentage. Accounting lines cannot be 0%.

Dermot.

 

05/27/14.

2014-05-27 21:42:06,255 [RMI TCP Connection(154)-129.82.111.82] INFO  com.rsmart.kuali.kfs.fp.batch.service.impl.ProcurementCardCreateDocumentServiceImpl :: Routing PCDO document # 3547687.

2014-05-27 21:42:06,276 [RMI TCP Connection(154)-129.82.111.82] FATAL org.kuali.rice.core.framework.persistence.jta.KualiTransactionInterceptor :: Exception caught by Transaction Interceptor, this will cause a rollback at the end of the transaction.

org.kuali.rice.kew.api.exception.InvalidActionTakenException: Document is not in a state to be routed

Dermot.

 

The first error I found matches this but at a different time (the job was restarted by Dermot as per Josh, hence the different time):

2014-05-27 19:31:41,026 [RMI TCP Connection(31)-129.82.111.82] INFO  com.rsmart.kuali.kfs.fp.batch.service.impl.ProcurementCardCreateDocumentServiceImpl :: Routing PCDO document # 3547687.

2014-05-27 19:31:41,053 [RMI TCP Connection(31)-129.82.111.82] FATAL org.kuali.rice.core.framework.persistence.jta.KualiTransactionInterceptor :: Exception caught by Transaction Interceptor, this will cause a rollback at the end of the transaction.

org.kuali.rice.kew.api.exception.InvalidActionTakenException: Document is not in a state to be routed

It looks like the job is trying to catch WorkflowException but the error from Rice was org.kuali.rice.kew.api.action.InvalidActionTakenException which is a WorkflowRuntimeException so it wasn’t trapped (WorkflowException is not related to WorkflowRuntimeException). 

The document is in saved status so I’m not sure why it was trying to process it in the first place, maybe I’m seeing the symptom of something else?

Mike.
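
Note: a minimal sketch of the untrapped-exception point Mike makes above, using Python stand-ins for the Java classes. The three classes below only mirror the relationship he describes (the handler traps WorkflowException, while the raised exception belongs to the unrelated WorkflowRuntimeException hierarchy); they are not the Rice classes, and `route_document` and `route_and_trap` are hypothetical names.

```python
class WorkflowException(Exception):
    """Stand-in for the exception type the job's handler traps."""


class WorkflowRuntimeException(RuntimeError):
    """Stand-in for the separate runtime exception hierarchy."""


class InvalidActionTakenException(WorkflowRuntimeException):
    """Stand-in for the exception raised when routing fails."""


def route_document(doc_id):
    # Always fails, to mimic the document that could not be routed.
    raise InvalidActionTakenException("Document is not in a state to be routed")


def route_and_trap(doc_id):
    """Mimic a handler that only traps WorkflowException."""
    try:
        route_document(doc_id)
    except WorkflowException:
        return "trapped"
    return "routed"
```

Calling `route_and_trap(3547687)` lets InvalidActionTakenException propagate to the caller, because the `except` clause names a class that is not one of its ancestors.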

 

The PCard documents are created and put in saved status. There is another job (the one that is failing) that actually routes the PCard Document.

That job moves it from Saved to Enroute. The SQL in the log shows that it is looking for any PCDO doc with status of S.

Josh.

 

 

 

Aborted Module Name: ODSRKFSX.ODSRS002_01 

  Date:        Day:      Time:          Resolution:

02/10/14     Mon        23:23          Restarted by Dermot.

02/26/14     Wed        02:26          Restarted by Robin.

Error log and follow up comments:

 

02/10/14.

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUF_PUR_REQS_T

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 165

59                dbms_mview.refresh('CSUBAN.CSUS_MAJOR_DESC_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

60                dbms_mview.refresh('CSUBAN.CSUS_INTERDISCIPLIN_PROGRAM_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

61                dbms_mview.refresh('CSUBAN.CSUG_EADM_EMAIL_ADDRESS_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

162              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_PUR_REQS_ITM_T');

163              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_PUR_REQS_SRC_T');

164              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_PUR_REQS_STAT_T');

165              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_PUR_REQS_T');

166              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_FP_PRCRMNT_ACCT_LINE');

167              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'L_CSUF_FP_PRCRMNT_CARD_TRN_MT');

168              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_FP_PRCRMNT_CD_HLD_DL');

 

Found the problem:

ORA-01400: cannot insert NULL into ("CSUKFS"."CSUF_PUR_REQS_T"."REQS_STAT_CD")

I dropped and recreated the table and removed the “NOT NULL” option on the column.

I also modified the ODSRS002 file in /ais01/src/sql/temp so we don’t run the KFS mappings that have already completed.

Please restart.

Mark P.

 

02/26/14.

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUF_COFRS_DETAIL_T

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 156

 

Here is the error on the database side.

 

"ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

ORA-02063: preceding line from KFSPRD@KFS_KFSUSER_LOCATION"

 

Robin restarted the job, and I will watch the ODS to make sure this mapping completes.

Mark P.

 

 

 

Aborted Module Name:  KFSXAM99.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

02/11/14     Tue        01:08          Deleted by Dermot.

 

Error log and follow up comments:

 

 

KFSXAM99.KFSX_JAVA_01 / KFSXAM99_KFSX_SCHEDULE_DONE / dailyEmailStep ABORTED on AWPROD.

 

I manually requested KFSXTOMC_TOMCAT_CYCLE and bounced KFS; this is the next step to run after the ABORTED Java step.

I also deleted KFSXTOMC_TOMCAT_CYCLE within the ABORTED chain so when the chain gets restarted in the morning it won’t kick any kfs users out of the system.

The ABORTED step is the only remaining component in the chain.

 

2014-02-11 01:08:24,265 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'enActionListEmailService': FactoryBean threw exception on object creation; nested exception is java.lang.IllegalStateException: Service must exist and no service could be located with serviceNamespace='null' and name='enActionListEmailService'

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Remote Shell errtrap_ssh parm 2 value is 1

 

 

As per Mike, I have made this java step plus the KFSXAM99.KFSX_JAVA_02 / weeklyEmailStep INACTIVE in Appman until further notice.

Dermot.

 

 

 

Aborted Module Name:   ODSRAGEN.ODSRS002_01

  Date:        Day:      Time:          Resolution:

02/11/14     Tue        03:38          Deleted by Dermot.

03/31/14     Mon       02:50          Restarted by Robin.

 

Error log and follow up comments:

02/11/14.

ERROR at line 1:

ORA-00928: missing SELECT keyword

no output from ODSRAGEN.ODSRS002_01

+ err=100

 

It was on the last process:

dbms_mview.refresh('CSUBAN.CSUS_TEST_SCORES_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

Not sure why it failed…It runs fine manually.

I also have not seen this error message before:

SQL>  ('INVALID REFRESH NAME');

('INVALID REFRESH NAME')

  *

ERROR at line 1:

ORA-00928: missing SELECT keyword

 

The student refresh is done now, go ahead and abort it and let any downstream processes finish up.

Mark. P.

 

03/31/14.

ORA-12008: error in materialized view refresh path

ORA-01427: single-row subquery returns more than one row

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2563

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2776

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2745

 

Viewed ODS control reports. Saw no errors there. Checked error log in email from Robin. This error:

ORA-01427: single-row subquery returns more than one row

occurred in the refresh of one of the materialized views.

It is indicative of an unexpected data issue. This will have to be analyzed at a functional level tomorrow morning……………Shawn.

 

There is a data problem with 'CSUBAN.CSUS_STUDIO_ABROAD_MV' that will need to be resolved.  Vicki’s team will need to work on this.

begin

  dbms_mview.refresh('CSUBAN.CSUS_STUDIO_ABROAD_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

end;

Error report -

ORA-12008: error in materialized view refresh path

ORA-01427: single-row subquery returns more than one row

I will modify the ODSRS002.sql file, put it in the temp directory and have the scheduling team restart the process so we can get as much of the data refreshed as possible. After the data issue is fixed, I can manually run the CSUS_STUDIO_ABROAD_MV refresh to update the data in ODS Prod.

Mark P.

 

 

 

Aborted Module Name:   ADMSAPPL.ADMSS484_01

  Date:        Day:      Time:          Resolution:

02/10/14     Mon        22:20          Restarted by Joleen.

 

Error log and follow up comments:

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-12899: value too large for column "SATURN"."SARADDR"."SARADDR_STAT_CDE"

ORA-06512: at line 798

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

22:20:43 798     insert into saraddr

22:20:43 799   (saraddr_aidm,

22:20:43 800    saraddr_appl_seqno,

22:20:43 801    saraddr_pers_seqno,

22:20:43 802    saraddr_seqno,

22:20:43 803    saraddr_load_ind,

22:20:43 804    saraddr_activity_date,

22:20:43 805    saraddr_street_line1,

22:20:43 806    saraddr_street_line2,

22:20:43 807    saraddr_city,

22:20:43 808    saraddr_stat_cde,

22:20:43 809    saraddr_natn_cde,

22:20:43 810    saraddr_zip,

22:20:43 811    saraddr_lcql_cde

22:20:43 812    )

 

Kathy located the offending record and has asked Admissions to fix it. She put a temp sql out to skip the bad record and I have restarted ADMSAPPL. ADMSAPPL.ADMSS484_01 has completed.

Joleen.

 

 

Aborted Module Name:  ADMSAPPL.SARETMT-CHAIN_01  

  Date:        Day:      Time:          Resolution:

02/10/14     Mon        22:20          Restarted by Joleen.

04/10/14     Thu         22:23          Restarted by Dermot.

 

Error log and follow up comments:

 

02/10/14.

+ + awexe get_var_value subvar=#saretmt_status_12488508

saretmt_status=NONE

+ [[ NONE != SUCCESS ]]

+ print SARETMT_CHAIN subchain component(s) failed - aborting

SARETMT_CHAIN subchain component(s) failed - aborting

+ exit 100

 

Thanks for looking at this one. SARETMT-CHAIN calls a process flow called SARETMT, and that is where the abort is. You were getting close!

Joleen.

 

04/10/14.

+ + awexe get_var_value subvar=#saretmt_status_12949539
saretmt_status=NONE
+ [[ NONE != SUCCESS ]]
+ print SARETMT_CHAIN subchain component(s) failed - aborting
SARETMT_CHAIN subchain component(s) failed - aborting
+ exit 100

 

Below is the data error from the subchain.

declare

ERROR at line 1:

ORA-12899: value too large for column "SATURN"."SARADDR"."SARADDR_STAT_CDE" (actual: 8, maximum: 3)

ORA-06512: at line 810

 

I put a temp version of ADMSS484 in to bypass the record with bad data.  Can one of you please restart the job chain?  

Kathy.
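
Note: a minimal sketch of how the record behind an ORA-12899 could be located before the insert runs. The limit of 3 comes from the error text above; the function name `find_oversized`, the dictionary layout and the sample rows are hypothetical. Python's `len` counts characters, which may differ from how the column length is defined in the database.

```python
def find_oversized(records, field, max_len):
    """Return the records whose value for `field` is longer than max_len."""
    return [
        rec for rec in records
        if rec.get(field) is not None and len(rec[field]) > max_len
    ]


# Hypothetical staged rows; only the second exceeds the 3-character limit.
sample = [
    {"aidm": 1, "stat_cde": "CO"},
    {"aidm": 2, "stat_cde": "COLORADO"},
    {"aidm": 3, "stat_cde": None},
]
oversized = find_oversized(sample, "stat_cde", 3)
```

Here `oversized` holds only the row with `aidm` 2, which is the kind of record a temp version of the SQL would need to bypass.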



 

 

Aborted Module Name:   AGENDYGN.AGENS006_01 

  Date:        Day:      Time:          Resolution:

02/11/14     Tue        19:01           Restarted by Joleen.

 

 

Error log and follow up comments:

ERROR at line 1:

ORA-20100: ::Hold from date must be less than or equal to hold to date::

ORA-06512: at line 1991

19:01:37 1985  exception

19:01:37 1986    when others then

19:01:37 1987      -- flush output needed for troubleshooting

19:01:37 1988      put_report_line('Error: ' || sqlerrm ||' ' || v_pidm);

19:01:37 1989      utl_file.fflush(file_handle);

19:01:37 1990      utl_file.fclose(file_handle);

19:01:37 1991      raise; -- reraise the exception

19:01:37 1992  end;

 

I’m not sure if this will be helpful, but below is a portion of the utl file showing where processing stopped.

Joleen.

-11-FEB-2014 19:02:47                                 Colorado State University               

                                  Add holds for persons with ended mailing addresses and no hold

PIDM     HTYP From Date Reason                                                                 

---------------------------------------                                                     

10010233 DX   11-FEB-14 Address update needed                                               

10234548 DX   11-FEB-14 Address update needed  

 

11378541 DX   12-FEB-14 Address update needed                                               

11383432 DX   11-FEB-14 Address update needed                                               

11385288 DX   11-FEB-14 Address update needed                                               

11387845 DX   11-FEB-04 Address update needed                                                

Error: ORA-20100: ::Hold from date must be less than or equal to hold to date::

 

830061639           Sanchir, Ariunchimeg

 

This person has an MA address with a good-through date whose year is 2104 instead of 2014.

Can you have someone please remove the to date or fix it and let us know so that we can restart AGENS006 and the schedule?

02/10/2104         KCHACON           830061639           Sanchir, Ariunchimeg     11387845            

MA         2              13-JUN-13           10-FEB-04            1017 S Birch St B208

select to_char(spraddr_to_date,'MM/DD/YYYY'), spraddr_user, spriden_id, spriden_last_name || ', ' || spriden_first_name namex, a.*

  from spraddr a, spriden

  where spraddr_pidm = 11387845

  and   spraddr_atyp_code = 'MA'

  and   spriden_pidm = spraddr_pidm

  and   spriden_change_ind is null;

Vicki.

 

I’ve corrected the address to date.

Jerry Becker.
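
Note: a minimal sketch of a sanity check that would flag an address date range like the one above (from 13-JUN-13, to 02/10/2104). The function name `suspicious_to_date` and the 50-year threshold are assumptions for illustration, not part of AGENS006.

```python
from datetime import date


def suspicious_to_date(from_date, to_date, max_years_ahead=50):
    """Return a reason string if the date range looks wrong, else None."""
    if to_date is None:
        return None  # open-ended address, nothing to check
    if to_date < from_date:
        return "to date is before from date"
    if to_date.year - from_date.year > max_years_ahead:
        return "to date is implausibly far in the future"
    return None
```

With the dates from this entry, `suspicious_to_date(date(2013, 6, 13), date(2104, 2, 10))` returns the "implausibly far in the future" reason, while the same call with `date(2014, 2, 10)` returns `None`.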

 

  

 

 

 

Aborted Module Name:   AREGDYTS.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

02/17/14     Mon       06:47           Restarted by Joleen.

 

Error log and follow up comments:

 

 

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109"  colora-88@iwantmytranscript.com

# > sftp> pwd

# > Remote working directory: /home/scponly/home/colora-88

# > sftp> pwd

# > Remote working directory: /home/scponly/home/colora-88

# > sftp> lpwd

# > Local working directory: /ais101jfs/jobprd

# > sftp> lls -l /ais01/bkp/AREGTTRN.AREGS621_01.12537406.XML

# > -rw-rw----    1 appworx  Gprd            487 Feb 17 06:46 /ais01/bkp/AREGTTRN.AREGS621_01.12537406.XML

# > sftp> -ls -l /home/colora-88/statuses/awaiting-process/colorado_state.20140217_064719.status.xml

# > Couldn't stat remote file: No such file or directory

# > Can't ls: "/home/colora-88/statuses/awaiting-process/colorado_state.20140217_064719.status.xml" not found

# > sftp> put /ais01/bkp/AREGTTRN.AREGS621_01.12537406.XML /home/colora-88/statuses/awaiting-process/colorado_state.20140217_064719.status.xml

# > Uploading /ais01/bkp/AREGTTRN.AREGS621_01.12537406.XML to /home/colora-88/statuses/awaiting-process/colorado_state.20140217_064719.status.xml

# > remote open("/home/colora-88/statuses/awaiting-process/colorado_state.20140217_064719.status.xml"): No such file or directory

# > (1)

#==============================================================================

# FATAL : Command failed with code : 1

#------------------------------------------------------------------------------

# RETURN CODE = 100

 

 

AREGDYTS.SSH_SFTP_01 is complete.

 

I had to modify the SSH_SFTP path.

David.

 

 

 

Aborted Module Name: KFSXPDSA.KFSX_JAVA_01 

  Date:        Day:      Time:          Resolution:

02/25/14     Tue       09:17           Restarted by Dermot.

 

Error log and follow up comments:

 

 

KFSXPDSA.KFSX_JAVA_01 / KFSXPDSA_PDP_SEND_ACH_ADVICE / pdpSendAchAdviceNotificationsStep

 

at org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient.<clinit>(BatchJobRmiInvokerClient.java:18)

log4j:ERROR setFile(null,true) call failed.

java.io.FileNotFoundException: /srv/kfs/tomcat/logs/kfs-memory.log (No such file or directory)

 

       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

        at java.lang.Thread.run(Thread.java:662)

Caused by: org.springframework.mail.MailSendException: Failed messages: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <STEPHANIE.TEUFEL@COLOSTATE.EDU>... User unknown

; message exceptions (1) are:

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <STEPHANIE.TEUFEL@COLOSTATE.EDU>... User unknown

 

The correct email address is Sharon.Teufel@coloradostate.edu.

Linda.

 

update PDP_PAYEE_ACH_ACCT_T set PAYEE_EMAIL_ADDR = 'bfs_accounts_payable@mail.colostate.edu'
where upper(PAYEE_EMAIL_ADDR) = upper('STEPHANIE.TEUFEL@COLOSTATE.EDU');

update pdp_pmt_grp_t set adv_email_addr = 'bfs_accounts_payable@mail.colostate.edu'
where upper(adv_email_addr) = upper('STEPHANIE.TEUFEL@COLOSTATE.EDU');

 

John Swaro requested we change the invalid email address to 'bfs_accounts_payable@mail.colostate.edu'.

 

Got Shawn to update the following tables:

PDP_PAYEE_ACH_ACCT_T

PDP_PMT_GRP_T

 

Dermot.
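
Note: a minimal sketch of pulling the rejected address out of a stack trace like the one above, so it can be looked up in the payee tables. The regular expression is written against the SMTPAddressFailedException lines shown in this entry only; the names `REJECTED` and `rejected_addresses` are hypothetical.

```python
import re

# Matches e.g. "SMTPAddressFailedException: 550 5.1.1 <USER@HOST>... User unknown"
REJECTED = re.compile(r"SMTPAddressFailedException: 550 [\d.]+ <([^>]+)>")


def rejected_addresses(log_text):
    """Return the distinct addresses rejected with SMTP 550, in first-seen order."""
    seen = []
    for match in REJECTED.finditer(log_text):
        addr = match.group(1)
        if addr not in seen:
            seen.append(addr)
    return seen
```

The address is returned as it appears in the log, so a lookup would still need the case-insensitive comparison used in the update statements above.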

 

 

Aborted Module Name:  APMXPAGE.APMXPAGE-LOOP_01

  Date:        Day:      Time:          Resolution:

02/27/14     Thu       09:17           Restarted by Steve.

 

Error log and follow up comments:

 

-----Original Message-----
From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]
Sent: Thursday, February 27, 2014 11:00 AM
To: 9702184855@vtext.com; IS DL: Alert APMX; 9705815577@tmomail.net
Subject: CRITICAL AWPROD COMPONENT FAILURE

 

Thu Feb 27 10:59:30 MST 2014

**CRITICAL AWPROD COMPONENT FAILURE**

APMXPAGE

 

 

This is resolved -- the APMXPAGE.APMXPAGE-LOOP_01 aborted with :

231 Error number from open pipe 2. /appworx/pipe/AWAPI_AWPROD_PIPE

334 Check that the API server is running.

 

I restarted and it was running fine, but I noticed today’s APMXPAGE waiting to run and figured it would be less messy to let that one run instead. I killed yesterday’s APMXPAGE and today’s is now running.

Steve.

 

 

 

Aborted Module Name:   HRMSMTH2_HRMSR233_01

  Date:        Day:      Time:          Resolution:

03/03/14     Mon       21:34           Restarted by Robin.

 

Error log and follow up comments:

 

 

Program exited with status 1

Concurrent Manager encountered an error while running Oracle*Report for your concurrent request 8070690.

 

Review your concurrent request log and/or report output file for more detailed information.

+---------------------------------------------------------------------------+

Executing request completion options...

 

Output file size:

848

Finished executing request completion options.

 

+---------------------------------------------------------------------------+

Concurrent request completed

Current system time is 03-MAR-2014 21:37:43

 

+---------------------------------------------------------------------------+

     

FILEID:13587357

FILEID:13587358

Mon Mar 03 21:38:03 MST 2014:Job final status: Error(102)

Mon Mar 03 21:38:05 MST 2014                              Page 1

Generate Report-Concurrent Program Parameter(s) for this jobid

Parameter                           Value      

 

Please rerun this report.

-Bob-

 

 

 

Aborted Module Name: HRMSCPR_QPH.HRMSR204_01  

  Date:        Day:      Time:          Resolution:

03/10/14     Mon       08:42           Restarted by Robin.

 

Error log and follow up comments:

 

 

Arguments

------------

p_begin='07-mar-2014'

p_end='07-mar-2014'

p_payroll='22'

p_action_type='Q'

------------

Execution options

VERSION=2.03b

Current NLS_LANG and NLS_NUMERIC_CHARACTERS Environment Variables are :

American_America.US7ASCII

 

Enter Password:

REP-0069: Internal error

REP-57054: In-process job terminated:Finished successfully but output is voided

 

Report Builder: Release 10.1.2.3.0 - Production on Mon Mar 10 08:42:33 2014

 

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

 

+---------------------------------------------------------------------------+

Start of log messages from FND_FILE

+---------------------------------------------------------------------------+

+---------------------------------------------------------------------------+

End of log messages from FND_FILE

+---------------------------------------------------------------------------+

+---------------------------------------------------------------------------+

Program exited with status 1

Concurrent Manager encountered an error while running Oracle*Report for your concurrent request 8077903.

 

Review your concurrent request log and/or report output file for more detailed information.

 

 

Please rerun this report and let us know if it runs fine the second time through.

-Bob-

 

 

 

Aborted Module Name:  AGENDSRP.AGENS034_01

  Date:        Day:      Time:          Resolution:

03/10/14     Mon       10:05           Restarted by Steve.

 

Error log and follow up comments:

 

 

email_error_message:

Rq_id =116

email_address:

email_success: F

email_reply: *ORA ERR*-29279 ORA-29279: SMTP permanent error: 553 5.0.0 <>...

User address required

email_error_message: ORA-29279: SMTP permanent error: 553 5.0.0 <>... User address required

email_error_message:

Rq_id =117

email_address:

email_success: F

email_reply: *ORA ERR*-29279 ORA-29279: SMTP permanent error: 553 5.0.0 <>...

User address required

email_error_message: ORA-29279: SMTP permanent error: 553 5.0.0 <>... User address required

 

 

Restarted per Bill and completed successfully.

David.

 

 

 

Aborted Module Name:   FAIDDLM2_EV.RPEDISB_16

  Date:        Day:      Time:          Resolution:

03/11/14     Tue        13:33           Restarted by Joleen.

 

Error log and follow up comments:

 

 

STARTING PROGRAM - RPEDISB - Version 8.19

PARAMETER (1)     = jobprd

PARAMETER (2)     = Password

PARAMETER (3)     = 3354934

PARAMETER (4)     = RPEDISB

   SQLCODE  = 12519

   SQLERRMC = ORA-12519: TNS:no appropriate service handler found

 

Re-started and completed. This also happened on FAIDDLM2_EV.RPEDISB_18.

Is oracle okay?

Joleen.

 

A whole flood of Financial Aid web users hit the system and blew out the max number of processes.

I’m in the process of killing them off right now so hopefully this was just a glitch.

Mark. B.

 

 

 

Aborted Module Name:   AGENDYGN.AGENS033_01

  Date:        Day:      Time:          Resolution:

03/14/14     Fri         19:11           Restarted by Joleen.

 

Error log and follow up comments:

 

 

*** Start of AGENS033 03/14/2014 19:11:24

Deleted hold G6 for student ID 829453073: Song,Tao

Deleted hold G6 for student ID 829578275: King,Ashlea

Error: could not delete hold G6 for student ID 829453073: Song,Tao: ORA-20100:

::Cannot delete, hold record does not exist::

Deleted hold G6 for student ID 829939146: Devaragudi,Prakruthi

Deleted hold G6 for student ID 829929661: Sharif,Rabab

Deleted hold G6 for student ID 829883832: Negley,Joshua

.

. Summary:

.  Number of records updated: 5

.  Number of errors: 1

**** End of AGENS033 03/14/2014 19:11:26

 

I would not worry about this error.  Please let the schedule continue.

Vicki.

 

Erin reminded me that this job has an output scan to ignore this error and not abort.

Any ideas why the scan didn’t work?

Joleen.
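
Note: a minimal sketch of an output scan with an ignore list, to illustrate the idea Erin describes. How the real scan on this job is configured is not shown in this entry, so the marker strings, the ignore text and the name `scan_output` are all assumptions. The sketch also looks at the following line, because in the output above the ORA-20100 message continues on a second line; whether that has anything to do with the real scan not matching is not known.

```python
ERROR_MARKERS = ("ORA-", "Error:")
IGNORED = ("Cannot delete, hold record does not exist",)


def scan_output(lines):
    """Return error lines that are not covered by the ignore list."""
    failures = []
    for i, line in enumerate(lines):
        if not any(marker in line for marker in ERROR_MARKERS):
            continue
        # Include the next line, since a message may continue there.
        following = lines[i + 1] if i + 1 < len(lines) else ""
        context = line + " " + following
        if any(text in context for text in IGNORED):
            continue
        failures.append(line)
    return failures
```

A job wrapper could abort only when `scan_output` returns a non-empty list.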

 

 

 

Aborted Module Name:  FAIDNCPP_OD.XMLSQL_01  

  Date:        Day:      Time:          Resolution:

03/18/14     Tue         05:42          Deleted by Gudrun & new chain requested in.

 

Error log and follow up comments:

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

ERROR at line 1:

ORA-00001: unique constraint (CSUBAN.RWRCSSN_PK) violated

***

 

I have attached the insert statements that the XMLLDR component of FAIDNCPP_OD generated successfully from the two XML datafiles it processed.

The XMLSQL component fails to execute the insert statements because of a unique constraint data error.

The attached SQL script is located in /ais01/src/sql/temp.

Details of the two XML datafiles processed by XMLLDR to generate the above insert statements:

Location:

$ cd /ais01/ftp/from/user/FAIDNCPP_input/

$ ls ncp*

ncp_data_4075_9084720.xml   ncp_data_4075_9085412.xml

$ hostname

Kebler

$ pwd

/ais01/ftp/from/user/FAIDNCPP_input

Gudrun.

 

The insert statement file that was attached to the previous email from Gudrun has identical inserts for the first record and the last record (KrystalLynne Allen).  That is what is causing the PK violation.  The student record causing the PK violation is the first XML record set in file ncp_data_4075_9084720.xml and only appears once in that file.  That student record is not in the second XML file, so the insert statement appears to be duplicated after parsing (it's not duplicated in the XML files).

I also see in the XML files on Kebler that another record in the second XML file (ncp_data_4075_9085412.xml) does NOT appear in the insert statement file (Theresa Hart).

Rob.
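
Note: the duplicate that Rob describes could be caught before XMLSQL runs by scanning the generated insert file for repeated key values. The sketch below is only an illustration. It assumes, without confirmation from the log, that each insert statement sits on one line and that the primary key is the first value in the VALUES list. The table and column names in the sample are hypothetical.

```python
import re
from collections import Counter

# Assumption: one INSERT per line, primary key is the first value listed.
KEY_PATTERN = re.compile(r"values\s*\(\s*'?([^',)]+)'?", re.IGNORECASE)


def duplicate_keys(statements):
    """Return key values that appear in more than one INSERT statement."""
    counts = Counter()
    for line in statements:
        match = KEY_PATTERN.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical sample: the first and last statements carry the same key.
sample = [
    "insert into rwrcssn (ssn, name) values ('111', 'A');",
    "insert into rwrcssn (ssn, name) values ('222', 'B');",
    "insert into rwrcssn (ssn, name) values ('111', 'A');",
]
print(duplicate_keys(sample))
```

A check like this would flag the repeated first/last record but would not detect a record that is missing from the file, such as the second problem Rob mentions.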

 

 

 

 

 

Aborted Module Name:  WHRSL030.SQLLOAD_02

  Date:        Day:      Time:          Resolution:

 03/19/14     Wed       22:17          Restarted by Joleen.

 

Error log and follow up comments:

 

 

value used for ROWS parameter changed from 500 to 261
SQL*Loader-926: OCI error while executing delete/truncate (due to REPLACE/TRUNCATE keyword) for table "CSUH"."WHRS_CUR_FY_GP_00"

ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDO_SPACE'

 

We restarted the failed job and it completed.

Joleen.

 

 

 

Aborted Module Name:   ADMSSRLD_DY.SURLOAD_03

  Date:        Day:      Time:          Resolution:

03/19/14     Wed       22:33           Restarted by Joleen.

07/09/14     Wed       22:29           Restarted by Joleen.

Error log and follow up comments:

03/19/14.

ORA-02291: integrity constraint (GENERAL.FK1_GLBEXTR_INV_GLBSLCT_KEY) violated - parent key not found

WRN-ORACERR: Error occurred in file "surload.pc" at line 705

WRN-ERRSTMT: Following statement was last statement parsed:

    insert into GLBEXTR (glbextr_key,glbextr_application,glbextr_selection

surload terminated with error

11 lines written to /appworx/out/surload_3363475.lis

 

19-MAR-2014 22:33:51                                 Colorado State University                                            PAGE 1

999999                                                  Communication Load                                                8.2

                                                       AUTOMATED GURMAIL LOAD

                                                            Load Errors

PIDM      Id                            Name                     System Indicator     Description/Comments

 

011063097 827787500 Jalal , Runeela                              S                    Not Loaded; Duplicate Non-printed letter

 

Joleen, I believe this student has an invalid email that we are trying to send to so I’ve deleted the email transaction in SUAMAIL. 

You can restart this job when you are ready.

Marcella.

 

07/09/14.

WRN-ORACERR: Error occurred in file "surload.pc" at line 705

WRN-ERRSTMT: Following statement was last statement parsed:

    insert into GLBEXTR (glbextr_key,glbextr_application,glbextr_selection

surload terminated with error

11 lines written to /appworx/out/surload_3481115.lis

 

09-JUL-2014 22:29:13                                 Colorado State University                                            PAGE 1

999999                                                  Communication Load                                                8.2

                                                       AUTOMATED GURMAIL LOAD

                                                            Load Errors

PIDM      Id                            Name                     System Indicator     Description/Comments

010998112 827332500 Sorel , Keelie C                             S                    Not Loaded; Duplicate Non-printed letter

 

 

 

Aborted Module Name:  AREGDRDY.AREGS707_01  

  Date:        Day:      Time:          Resolution:

03/19/14     Wed       20:17           Restarted by Joleen.

 

Error log and follow up comments:

 

 

*** SEARCH OF STDOUT FOR SQL ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR at line 1:

ORA-01400: cannot insert NULL into ("GENERAL"."GLBEXTR"."GLBEXTR_KEY")

ORA-06512: at line 66

***

 

I think we have a logic error going on with AREGS707. Although I'm not quite sure why. I believe 707 is used to find the list of students that need to be emailed because we processed a transfer evaluation the day before and delete the list from the prior day. I'm not sure why it is trying to insert a null value.

I have copied Zach. Maybe he can help check if the query to populate GLBEXTR is relying on darwin tables rather than uachieve tables?

 

Ok - I think I see the issue.

Zach - it is a date issue. Last_mod_date is being used in the query. It is now a date field. Looks like we are comparing against a character field. See lines 55 and 56 and maybe more.

 

20:17:08  53      where a.int_seq_no = b.stu_mast_no

20:17:08  54      and b.certify = 'S'

20:17:08  55      and (b.last_mod_date > to_date(to_char((vdate -1), 'mm/dd/yyyy') || ' 23:59:59', 'mm/dd/yyyy hh24:mi:ss')

20:17:08  56           and b.last_mod_date < trunc(vdate) + 1)

20:17:08  57      and csus_f_is_internatl(a.pidm) = 'Y'

20:17:08  58      and exists (select 'Y'

Jamie.

 

When I run the inner query (the select from stu_master, stu_demo), it works fine. I think last_mod_date was always a date, and if you look closely, it's actually taking a date, converting it to a string with a particular format, and converting it back to a date to compare to last_mod_date. I still think the only way we could be getting the "ORA-01400: cannot insert NULL into ("GENERAL"."GLBEXTR"."GLBEXTR_KEY")" message is that the pidm field on one or more stu_master records being pulled in is null.

Zach.
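
Note: the date window Zach describes for lines 55 and 56 can be restated outside of SQL to make the bounds easier to see. This is a sketch only. The function names are made up for illustration, and it assumes vdate is a date with a time portion, as an Oracle DATE would be.

```python
from datetime import datetime, time, timedelta


def last_mod_window(vdate):
    """Return the exclusive lower and upper bounds from lines 55 and 56."""
    # Line 55: to_date(to_char(vdate - 1, 'mm/dd/yyyy') || ' 23:59:59', ...)
    previous_day = (vdate - timedelta(days=1)).date()
    lower = datetime.combine(previous_day, time(23, 59, 59))
    # Line 56: trunc(vdate) + 1, which is midnight at the start of the next day
    upper = datetime.combine(vdate.date(), time(0, 0, 0)) + timedelta(days=1)
    return lower, upper


def in_window(last_mod_date, vdate):
    """True when last_mod_date falls strictly between the two bounds."""
    lower, upper = last_mod_window(vdate)
    return lower < last_mod_date < upper


run_time = datetime(2014, 3, 19, 20, 17, 8)
print(last_mod_window(run_time))
```

In other words, the comparison is date against date on both sides and selects rows modified at any time on the day of vdate, which is consistent with Zach's reading that the date handling is not the cause of the ORA-01400.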

 

I just checked my query and I think we have all the records in u.achieve with pidms.

Can we see if we can get the job to run to completion now?

Jamie.

 

 

 

Aborted Module Name:  KFSXGLPO_D2.KFSX_JAVA_02

  Date:        Day:      Time:          Resolution:

03/21/14     Thu         20:17           Restarted by Gudrun.

 

Error log and follow up comments:

 

03/21/2014 00:57    GUDRUNK

Researched aborted job KFSXGLPO_D2.KFSX_JAVA_02.

Restarted job after creating file gl_sortpost.restart.data in originEntry.

 

Details: Job failed to be started by AppMan. Restart requires file copy from bkp to be reinstated with KFSX Support staff confirming content. In Dermot's absence called Josh. Left message. Given that logon to Kuali app is down until resolved concluded that it is ok to copy file given that NO log file exists in dir /ais02/app/kfs/prd/work/staging/gl/originEntry for posterEntriesStep.

This indicates that the job never started processing on the prod Kuali server and is safe to restart.

Pressed to restart because tomcat is not being started up in next process flow until error is resolved.

 

 

 

Aborted Module Name: AREGDRGC_SP.SPOOL_TO_PRINT_01   

  Date:        Day:      Time:          Resolution:

03/21/14     Fri         00:09           See note from Elden below.

 

 

Error log and follow up comments:

 

 

AREGDRGC_SP.SPOOL_TO_PRINT_01 is in ABORTED status. This is the first time this process has run in prod with the new u.achieve. I must have missed something when putting this together because the file does not exist. Also, it doesn't look like this condition worked on DARS CONTRACTS (I don't see anything in bkp): {#logrunhost}; cp {#workdat}/{module}.{chain_id}.DAT {#bkp}/{module}.{#aw_now_yyyy_mm_dd_time}.BKP

 

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cp: /ais01/dat/work/prod/AREGDRGC_SP.DARS_CONTRACTS_01.12781573.DAT: A file or directory in the path name does not exist.

 

It looks like AREGS708 reported errors:

21-MAR-2014 00:07:53                                 Colorado State University                                            Page: 1

.                                                   DARS Grad Contract Requests                                           AREGS705

No Term found in STVTERM for

Transactions Processed= 0

DARS Primary Majors Added= 0

DARS Secondary Majors Added= 0

DARS Minors Added= 0

Total Errors =  1

 

The input file to DARS_CONTRACTS is empty:

  /ais01/dat/work/prod/AREGDRGC_SP.AREGS708_01.12781573.DAT

So the output file

  /ais01/spool/out/AREGDRGC_SP.AREGS703_01.12781573.txt

is also empty.

Elden.

 

 

 

Aborted Module Name:  ADMSBSLT_BAN_FEED_TO_SLATE

  Date:        Day:      Time:          Resolution:

03/21/14     Fri         22:29           Restarted by Joleen.

 

Error log and follow up comments:

 

*** Follow-up Required -- ADMSBSLT_BAN_FEED_TO_SLATE has failed       

Admissions:

Please reply to this email with one of the following options:

_ _ Delete the ABORTED LYNX job

       Admissions has manually rerun it

or

_ _ Restart the ABORTED LYNX job

 

The error for this job is: Unable to connect to the remote server. I have attached the output.

 

According to this email message it looks like we have a couple options.

 

We could maybe try restarting but we usually don't do that without the end user's OK.

We could delete the job and ask Admissions to run the job manually on Monday or I could submit this job again on Monday.

 

This job takes about a minute to run. It is the only job left in the Admissions schedule.

Joleen.

 

Hi Trish and Erica

Do you want us to restart ADMSBSLT Banner to Slate this morning and try again?  Or would you prefer to run manually?

Vicki.

 

Erica contacted me yesterday and said we could let the job run this morning. I restarted and it has finished.

Joleen.

 

 

 

 

Aborted Module Name:  AREGDYCR.AREGS304_01

  Date:        Day:      Time:          Resolution:

04/01/14     Tue         03:36           Restarted by Joleen.

 

Error log and follow up comments:

 

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 714

ERROR at line 1:

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 805

***

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

 

03:36:32 713        BEGIN

03:36:32 714          SELECT 'First Generation'

03:36:32 715            INTO v_first_gen

03:36:32 716            FROM swrgpcd g

03:36:32 717           WHERE g.swrgpcd_gpcd_code = 'FIRSTGEN'

03:36:32 718             AND g.swrgpcd_pidm = main_rec.pidm;

03:36:32 719        EXCEPTION

03:36:32 720          WHEN NO_DATA_FOUND

03:36:32 721          THEN

03:36:32 722            v_first_gen   := 'Not First Generation';

03:36:32 723        END;

 

The problem is with the following person having multiple FIRSTGEN data codes

11328308             829643151           Schroder              Derek    Wayne

 

Can someone remove one of the FIRSTGEN rows and let us know please.

11328308             8020       FIRSTGEN                            12-OCT-11          admis_web        12-OCT-11           admis_web       

11328308             8020       FIRSTGEN            201490 24-MAR-14          JANALLEN           24-MAR-14         JANALLEN                BANNER_FORMS

Vicki.

 

I removed the FIRSTGEN code from his 2012 app

Do I need to remove it from his current 201490 app?

Janet Allen.

 

Thanks for fixing the problem by removing one of the 2 FIRSTGEN records.

We should be able to restart AREGS304.

Vicki.
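
Note: the SELECT ... INTO block at lines 713-723 handles NO_DATA_FOUND but not the case where more than one row comes back, which is what raised ORA-01422. One way to find such students ahead of the run is a grouped count on the code. The sketch below uses an in-memory SQLite table as a stand-in for swrgpcd with only the two columns named in the query. The pidm values are made up.

```python
import sqlite3

# Stand-in table: only the two columns referenced at lines 716-718.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table swrgpcd (swrgpcd_pidm integer, swrgpcd_gpcd_code text)"
)
conn.executemany(
    "insert into swrgpcd values (?, ?)",
    [
        (1001, "FIRSTGEN"),
        (1002, "FIRSTGEN"),
        (1002, "FIRSTGEN"),  # second row for the same student
        (1003, "OTHER"),
    ],
)


def pidms_with_duplicate_code(connection, code):
    """Return pidms that have more than one row for the given code."""
    rows = connection.execute(
        "select swrgpcd_pidm from swrgpcd "
        "where swrgpcd_gpcd_code = ? "
        "group by swrgpcd_pidm having count(*) > 1 "
        "order by swrgpcd_pidm",
        (code,),
    ).fetchall()
    return [row[0] for row in rows]


print(pidms_with_duplicate_code(conn, "FIRSTGEN"))
```

Any pidm this returns would make the single-row fetch in AREGS304 fail until one of the rows is removed.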

 

 

 

Aborted Module Name:  KFSXPDSA.KFSX_JAVA_01   

  Date:        Day:      Time:          Resolution:

04/01/14     Tue         09:14           Restarted by Dermot.

 

Error log and follow up comments:

 

One needs to go to Kebler to find the correct error; the appman logs do not report it.

KFSXPDSA.KFSX_JAVA_01 / KFSXPDSA_PDP_SEND_ACH_ADVICE /

 

log4j:ERROR setFile(null,true) call failed.

java.io.FileNotFoundException: /srv/kfs/tomcat/logs/kfs-memory.log (No such file or directory)

 

cd /ais02/app/kfs/prd/logs

ls -ltr KFSXPDSA*

vi KFSXPDSA.pdpSendAchAdviceNotificationsStep.12868852.12868890.00-20140401-09-14-49-639.log

shift & g brings you to the bottom of the file, page up to the error

 

 

Caused by: org.springframework.mail.MailSendException: Failed messages: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Bonnie.Funk@colostate.edu>... User unknown

; message exceptions (1) are:

Failed message 1: javax.mail.SendFailedException: Invalid Addresses;

  nested exception is:

        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <Bonnie.Funk@colostate.edu>... User unknown

 

hit escape then

:q! to exit without saving.

 

 Request TD ticket for DBA to update.

 

update PDP_PAYEE_ACH_ACCT_T set PAYEE_EMAIL_ADDR = 'john.swaro@colostate.edu'

where upper(PAYEE_EMAIL_ADDR) = upper('Bonnie.Funk@colostate.edu');

 

update pdp_pmt_grp_t set adv_email_addr = 'john.swaro@colostate.edu'

where upper(adv_email_addr) = upper('Bonnie.Funk@colostate.edu');

 

restart job.
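
Note: instead of paging through the step log in vi, the rejected address can be pulled out with a pattern match. This is a sketch under the assumption that the rejection always appears in the SMTPAddressFailedException form shown above. The sample address is a placeholder, not a real payee.

```python
import re

# Matches lines such as:
#   com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <addr>... User unknown
REJECTED = re.compile(r"SMTPAddressFailedException: 550 \S+ <([^>]+)>")


def rejected_addresses(log_text):
    """Return each rejected address once, in the order first seen."""
    seen = []
    for address in REJECTED.findall(log_text):
        if address not in seen:
            seen.append(address)
    return seen


sample_log = """\
Caused by: org.springframework.mail.MailSendException: Failed messages
  nested exception is:
        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <user@example.edu>... User unknown
Failed message 1: javax.mail.SendFailedException: Invalid Addresses;
  nested exception is:
        com.sun.mail.smtp.SMTPAddressFailedException: 550 5.1.1 <user@example.edu>... User unknown
"""
print(rejected_addresses(sample_log))
```

The address returned is the value to look for in PAYEE_EMAIL_ADDR and adv_email_addr when writing the update request for the DBA.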

 

 

 

 

 

Aborted Module Name: AREGDRGC_SP.AREGS705_01

  Date:        Day:      Time:          Resolution:

04/04/14     Fri          00:05           Restarted by Joleen.

 

Error log and follow up comments:

 

 

ORA-06502: PL/SQL: numeric or value error: character string buffer too small

ORA-06512: at line 299

 

*** END SEARCH OF LOG FOR SQL ERROR STRINGS

no output from AREGDRGC_SP.AREGS705_01

+ err=100

 

00:05:46 293

00:05:46 294          ---- format attr4 in first mi last

00:05:46 295          v_pos1      := instr(v_name, ',');                /* find comma position */

00:05:46 296          v_pos1      := v_pos1 - 1;                        /* remove comma */

00:05:46 297          v_last      := substr(v_name, 1, v_pos1);         /* collect last name */

00:05:46 298          v_pos1      := v_pos1 + 2;                        /* pos to beginning of first name */

00:05:46 299          v_part_name := ltrim(substr(v_name, v_pos1, 60)); /* get everything after the last name*/

00:05:46 300                                                            /* and remove any left spaces */

00:05:46 301          v_pos2 := instr(v_part_name, ' ', 1);             /* find pos of next space in part name*/

00:05:46 302          IF v_pos2 = 0

00:05:46 303          THEN                                              /* set length to 35 if no spaces exist */

00:05:46 304            v_pos2 := 35;                                   /* i.e. name goes clear to end */

00:05:46 305          END IF;

 

I think the reason that AREGS705 – Create DARS Graduation Contract Requests failed is because there is no SHBDIPL – Diploma Name record for the following person.  Please add this record and we will try running again.

829647309

11328912

Todd,Kazuo Scott Kuika'aleopelepohakalani Leithead

Vicki.

 

Unfortunately, the job aborted again with the same error :(

Joleen.

 

Okay I think I may know the problem

00:05:46  43      v_last         spriden.spriden_last_name%TYPE;

00:05:46  44      v_first        spriden.spriden_first_name%TYPE;

00:05:46  45      v_mid          spriden.spriden_mi%TYPE;

00:05:46  46      v_part_name    VARCHAR2(35);

Zach.

 

Can you please change v_part_name to be like spriden.spriden_first_name%TYPE instead of varchar2(35).

I will create a team dynamics ticket for this.

Vicki.

 

I’ve made this change and put the updated version in /ais01/src/sql/temp.

Zach.
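
Note: the failure at line 299 can be reproduced outside the database by repeating the same string steps. The sketch below mirrors lines 295-299 (find the comma, take up to 60 characters after it, trim leading spaces) and compares the result with the 35-character declaration at line 46. It counts characters, which assumes the VARCHAR2 length and the name are both single-byte.

```python
PART_NAME_LIMIT = 35  # declared size of v_part_name at line 46


def part_name(v_name):
    """Mirror lines 295-299: everything after the comma, max 60 chars, left-trimmed."""
    comma = v_name.find(",")
    return v_name[comma + 1 : comma + 1 + 60].lstrip(" ")


def fits_buffer(v_name, limit=PART_NAME_LIMIT):
    """True when the text after the last name fits in v_part_name."""
    return len(part_name(v_name)) <= limit


name = "Todd,Kazuo Scott Kuika'aleopelepohakalani Leithead"
print(part_name(name), fits_buffer(name))
```

Because the substr call allows up to 60 characters while the variable held 35, any first-plus-middle name longer than 35 characters overflows, regardless of whether a SHBDIPL record exists.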

 

 

 

 

Aborted Module Name:   CLMSDATA.SSH_EXEC_01

  Date:        Day:      Time:          Resolution:

04/05/14     Sat          08:36           Restarted by Joleen.

04/08/14     Tue        10:52           Manually run by Steven Dove.

 

Error log and follow up comments:

04/05/14.    

# 2014.04.05-08:44:06 : >    Source: Update Demo Info

# 2014.04.05-08:44:06 : >    Description: Implicit conversion from data type varchar to varbinary is not allowed. Use the CONVERT function to run this query.

 

# 2014.04.05-08:44:06 : >    Description: Executing the query "exec ckelly.spCSUUpdateDemosFromBanner" failed with the following error: "Invalid column name 'SS#'.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.

 

# 2014.04.05-08:44:06 : >    Description: Executing the query "exec ckelly.spCSUUpdateDemosFromBanner" failed with the following error: "Invalid column name 'SS#'.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.

 

# 2014.04.05-08:44:06 : >    Description: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED.  The Execution method succeeded, but the number of errors raised (3) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.

 

I think I’ve found and corrected the problem in CLM.  Can you restart the job?

~Steven Dove.

 

04/08/14.   

# 2014.04.08-10:52:39 : >    Description: There was an error with output column "ssn" (521) on output "OLE DB Source Output" (11). The column status returned was: "Text was truncated or one or more characters had no match in the target code page.".

 

# 2014.04.08-10:52:39 : >    Description: The "output column "ssn" (521)" failed because truncation occurred, and the truncation row disposition on "output column "ssn" (521)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.

 

code 0xC020902A.  The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.  There may be error messages posted before this with more information about the failure.

 

# 2014.04.08-10:52:39 : >    Description: SSIS Error Code DTS_E_THREADFAILED.  Thread "WorkThread0" has exited with error code 0xC0047039.  There may be error messages posted before this with more information on why the thread has exited.

 

There is a problem within the DTSX package that is being called.  We need to rebuild it and I will get back to you when we are ready to restart the process.

~Steven Dove.

 

Steven will run CLMSDATA manually until a fix is in place.

Joleen.

 

 

 

 

 

 

 

Aborted Module Name:  AGENDSRP.AGENS035_01

  Date:        Day:      Time:          Resolution:

04/07/14     Mon          10:05           Deleted by David.

04/21/14     Mon        10:05           Deleted by David.

 

Error log and follow up comments:

 

04/07/14.

email_reply: *ORA ERR*-29279 ORA-29279: SMTP permanent error: 553 5.0.0 <>...                                  

User address required                                                                         

email_error_message: ORA-29279: SMTP permanent error: 553 5.0.0 <>... User                   

address required                                                                             

email_address: Travis.Bailey@ColoState.EDU                                                   

email_success: T                                                                             

email_reply: Mail Sent                                                                        

email_error_message:                                                                         

email_address: Linda.Meserve@ColoState.EDU                                                   

email_success: T                                                                             

email_reply: Mail Sent                                                                       

email_error_message:                                                                          

email_address: Mercedes.Gonzalez-Juarrero@ColoState.EDU                                      

email_success: T                                                                             

email_reply: Mail Sent                                                                                                 

 

Deleted AGENS035 per Bill.

David.

 

04/21/14.

email_address: Travis.Bailey@ColoState.EDU

email_success: T

email_reply: Mail Sent

email_error_message:

email_address:

email_success: F

email_reply: *ORA ERR*-29279 ORA-29279: SMTP permanent error: 553 5.0.0 <>...

User address required

email_error_message: ORA-29279: SMTP permanent error: 553 5.0.0 <>... User address required

email_address: Heather.Foster@colostate.edu

email_success: T

email_reply: Mail Sent

email_error_message:

email_address: Keith.Wilson@ColoState.EDU

email_success: T

email_reply: Mail Sent

 

Deleted AGENDSRP.AGENS035_01 per Bill G.

David.
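
Note: in both runs the SMTP 553 error lines up with a blank email_address entry. A small parser over the job output can list the failing records directly. This is a sketch that assumes the layout seen in the 04/21 output, where each record starts with an email_address line and a long value may wrap onto the following line. The addresses in the sample are placeholders.

```python
def parse_email_log(lines):
    """Group email_* lines into one dict per email_address record."""
    records, current, last_key = [], None, None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep and key.startswith("email_"):
            if key == "email_address":
                current = {}
                records.append(current)
            if current is None:
                continue  # excerpt started in the middle of a record
            current[key] = value.strip()
            last_key = key
        elif current is not None and last_key:
            # Wrapped continuation of the previous value
            current[last_key] = (current[last_key] + " " + line).strip()
    return records


sample = """\
email_address: first.person@example.edu
email_success: T
email_reply: Mail Sent
email_error_message:
email_address:
email_success: F
email_reply: *ORA ERR*-29279 ORA-29279: SMTP permanent error: 553 5.0.0 <>...
User address required
email_error_message: ORA-29279: SMTP permanent error: 553 5.0.0 <>... User address required
""".splitlines()

records = parse_email_log(sample)
failed = [r for r in records if r.get("email_success") == "F"]
print(failed)
```

If the failed records always have an empty address, the fix belongs in the data that feeds the recipient list rather than in the mail step itself.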

 

 

 

Aborted Module Name:   AREGDRDY.AREGS707_01  

  Date:        Day:      Time:          Resolution:

04/08/14     Tue       20:10          Restarted by Joleen.

 

Error log and follow up comments:

 

 

ERROR at line 1:

ORA-01400: cannot insert NULL into ("GENERAL"."GLBEXTR"."GLBEXTR_KEY")

ORA-06512: at line 66

 

20:10:04  65          --Insert for non international students

20:10:04  66          insert into glbextr

20:10:04  67          (glbextr_application,

20:10:04  68     glbextr_selection,

20:10:04  69     glbextr_creator_id,

20:10:04  70     glbextr_user_id,

20:10:04  71     glbextr_key,

20:10:04  72     glbextr_activity_date,

20:10:04  73     glbextr_sys_ind,

20:10:04  74     glbextr_slct_ind)

20:10:04  75     (select distinct

20:10:04  76             'REGISTRAR',           --glbextr_application

20:10:04  77             'TRNS_EVAL_COMP_DOM',  --glbextr_selection

20:10:04  78             'REGUSER',             --glbextr_creator_id

20:10:04  79             'JOBPRD',              --glbextr_user_id

20:10:04  80             a.pidm,                --glbextr_key

20:10:04  81             sysdate,               --glbextr_activity_date

20:10:04  82             'S',                   --glbextr_sys_ind

20:10:04  83             null                   --glbextr_slct_ind

20:10:04  84      from stu_master a,

20:10:04  85           stu_evalgrp b

 

Katie, I think this job aborted because of some bad data. Could you guys clean it up on your side? It appears to be test student data that was in with the real data. The student number is BH-ENVZ-BS, and the job aborted because it has a null pidm (which makes sense because it’s not a real person). There is some evaluated transfer work for this student and that’s why this error is occurring. I’m guessing it was just Brenda doing some testing, but we’ll need to figure out how to keep these testing scenarios out. We’ll have to figure out if some future adjustment is needed.

-Zach.

 

We have cleared this up.  Brenda was testing the cloning and now we know if they have transfer work we should delete that.

Katie.
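
Note: since a null pidm on a stu_master record is what trips the insert into GLBEXTR, a pre-check for such records could be run before AREGS707. The sketch below uses an in-memory SQLite table as a stand-in for stu_master. The int_seq_no and pidm columns come from the query above, while stu_no and all of the sample values are hypothetical.

```python
import sqlite3

# Stand-in for stu_master; stu_no is an assumed column name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table stu_master (int_seq_no integer, stu_no text, pidm integer)"
)
conn.executemany(
    "insert into stu_master values (?, ?, ?)",
    [
        (1, "800000001", 10001),
        (2, "TEST-CLONE", None),  # test record with no pidm
        (3, "800000003", 10003),
    ],
)


def students_without_pidm(connection):
    """Return student numbers whose pidm is null."""
    rows = connection.execute(
        "select stu_no from stu_master where pidm is null order by stu_no"
    ).fetchall()
    return [row[0] for row in rows]


print(students_without_pidm(conn))
```

An empty result would mean the insert at line 66 should not hit ORA-01400 for this reason.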

 

 

 

 

Aborted Module Name:   FAIDAWNT_OD.LYNX_02

  Date:        Day:      Time:          Resolution:

04/11/14     Fri          00:37          Restarted by David.

 

Error log and follow up comments:

 

   URL=http://wsnet.colostate.edu/cwis231/autorun/award_letters.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

 

For this FAID job, you will find the error message here:

 111 /appworx/out/LYNX_12949689.00.stdout.txt

The error is in the standard out (stdout) file.

We don't have access to this directory, so I have to run copy_joblog to get the error. Steve should have access.

You will want to send this error to Mike Berry. He has been working with this job.

Joleen.

 

The FAIDAWNT_OD LYNX_02 step failed last night and is holding up the entire schedule. I have attached the output. Please let us know ASAP if this is okay to restart.

David.

 

I’m looking into this issue now. I have two files in our directory, over here, but I’m not seeing anyone in the .AXO.doc file. Candy told me you restarted it and it failed again. I deleted the two files and kicked the job off from my end. I got the same error, “No such file”, but I got the same two files. The AXO.doc is still empty, which shouldn’t have any bearing on this problem. However, it makes me think there is another problem in addition to the one I am presently working. I have a little bit of code that lets me see what files are in the to_faid directory and it shows me this.

/data/FAIDBDMS/to_faid/.
/data/FAIDBDMS/to_faid/..
/data/FAIDBDMS/to_faid/FAIDAWNT_OD.GLRLETR_01.12949675.APO.doc
/data/FAIDBDMS/to_faid/FAIDAWNT_OD.GLRLETR_02.12949675.AXO.doc

I’m somewhat mystified by the first two lines. But I don’t see how they could  be a problem anyway.

So there’s all that. An idea just popped into my head, so I’ll give it a try and let you know if I have any luck……………..Mike B.

 

I’m showing both files have data in the to_faid directory. Thanks for your help!

David.

 

Very weird. It is definitely objecting to the axo file. I’ve turned off that piece and it sent 4 APO.pdf files to the from_faid directory. However (insert additional weirdness here), there are also 3 revision letter pdf files (RVIMO) from 4/8 and 2 from 4/9. And even more weirdness, I have the RVIMO.doc file from 4/10, but there are no pdfs in from_faid for that date. I gave Candy the first and last csuid from all 3 rvimo.doc files and she says all but the last person on 4/9 are indexed. So apparently, those RVIMO….pdf files can be deleted. I save the files generated by the jobs, so we will look at other people in the file from 4/9 and see if any others haven’t been indexed and deal with that later.

I would say, at this point, you should delete the RVIMO…pdf files in the from_faid directory and start the job at the beginning of the indexing part. Can you email me a copy of the AXO.doc file in the to_faid directory? We probably don’t want to delete that one just yet, so maybe stop the job before it deletes files from to_faid.  Whew! This isn’t exactly how I wanted to start my Friday……………..Mike B.

 

I deleted the RVIMO pdf files from the from_faid directory and let the FAIDAWNT_OD job continue to the indexing step for the APO pdfs. I have attached the AXO doc file. I also have a copy of this file in my directory and our spool directory. I will allow the to_faid directory to be cleaned up for now since I have copies.

I re-started the FAIDAWNT_OD process flow starting at the BDMS indexing piece to process the AXO documents. This has now completed.

David.

 

 

 

 

Aborted Module Name:   TDCLIENT_SEND

  Date:        Day:      Time:          Resolution:

04/10/14     Thu         21:03          Restarted by David.

 

Error log and follow up comments:

 

 

# 20140410-210738 : pipe_exec                 | cmdout = <debug1: Exit status 107

> 

# 20140410-210738 : *** FATAL ***main::check_status | SEND_TO_CMD (close) [0] failed to execute (107)

# 20140410-210738 : *** FATAL ***main::check_status | (100)

# 20140410-210738 : *** FATAL ***main::check_status | #****************************************************************************************************

# 20140410-210738 : *** FATAL ***main::check_status | CMDOUT (close) [5570572] failed to execute (100)

# 20140410-210738 : *** FATAL ***main::check_status | (100)

# 20140410-210738 : *** FATAL ***main::check_status | #****************************************************************************************************

+ exit 100

 

 

 

Aborted Module Name:  ADMSODSL.LYNX_01  

  Date:        Day:      Time:          Resolution:

04/11/14     Fri          03:12          Restarted by David.

 

Error log and follow up comments:

 

#== Entering /appworx/csu/exec/LYNX.KSH [https://wsnet.colostate.edu/ai/appworx/bulkcopy.aspx /appworx/out/LYNX_12949590.00.stdout.txt /appworx/out/ADMSODSL.LYNX_01.12949590.00.stderr.txt /appworx/out/ADMSODSL.LYNX_01.12949590.00.status.txt STATUS=.*OK] ============

PARSE

SCAN STATUS REGEX = [STATUS=.*OK]

wc: 0653-755 Cannot open /appworx/out/ADMSODSL.LYNX_01.12949590.00.status.txt.

       0 /appworx/out/LYNX_12949590.00.stdout.txt

      15 /appworx/out/ADMSODSL.LYNX_01.12949590.00.stderr.txt

      15 total

*** /appworx/out/ADMSODSL.LYNX_01.12949590.00.stderr.txt ***

 

 

ADMSODSL.LYNX_01 could not connect. I re-started it with Kathy B’s approval.

 

Looking up wsnet.colostate.edu

Making HTTPS connection to wsnet.colostate.edu

Secure 128-bit TLSv1/SSLv3 (AES128-SHA) HTTP connection

Sending HTTP request.

HTTP request sent; waiting for response.

Retrying as HTTP0 request.

Looking up wsnet.colostate.edu

Making HTTPS connection to wsnet.colostate.edu

Retrying connection.

Looking up wsnet.colostate.edu

Making HTTPS connection to wsnet.colostate.edu

Alert!: Unable to make secure connection to remote host.

 

lynx: Can't access startfile https://wsnet.colostate.edu/ai/appworx/bulkcopy.aspx

David.
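
Note: the job output shows a scan regex of STATUS=.*OK being applied to the status file. How LYNX.KSH applies it internally is not shown in the log, so the sketch below only illustrates what that pattern accepts and rejects, treating a missing or empty status file as not OK.

```python
import re

# Pattern copied from the "SCAN STATUS REGEX" line in the job output.
STATUS_OK = re.compile(r"STATUS=.*OK")


def status_is_ok(status_text):
    """True when any line of the status text matches the scan regex."""
    return any(STATUS_OK.search(line) for line in status_text.splitlines())


print(status_is_ok("STATUS=HTTP/1.1 200 OK"))
print(status_is_ok("STATUS=HTTP/1.1 500 Internal Server Error"))
print(status_is_ok(""))
```

In this abort the status file was never written because the connection failed, so there was nothing for the pattern to match.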

 

ADMSODSL.LYNX_01 aborted. ADMSAM99 will complete once this is resolved. Please decide if you need to get this process flow completed; otherwise we assume it can be resolved on Monday after the upgrade. Only one LYNX step is needed; however, Kathy B and/or Marcella need to be contacted prior to any delete or reset of the job on our part.

Gudrun.

 

LYNX error: SHUTDOWN is in progress.

Login failed for user 'cwis54'. Only administrators may connect at this time.

Joleen.

 

 

 

 

Aborted Module Name:   FAIDSNTD.WAIT_FOR_CHAINS_01

  Date:        Day:      Time:          Resolution:

04/11/14     Fri          12:43           Restarted by Dermot.

 

Error log and follow up comments:

 

 

+ 1> /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.chain_prefix.dat

+ [[ -s /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.chain_prefix.dat ]]

+ print *** ERROR:  NO CHAIN MODULES FOUND FOR CHAIN

*** ERROR:  NO CHAIN MODULES FOUND FOR CHAIN

+ cat /ais01/dat/work/prod/FAIDSNTD.WAIT_FOR_CHAINS_01.jq.dat

     12950154.00 FAID      FAIDPLOD_PELL_ORIG_D04/11 12:42 FINISHED    AWPROD

+ exit 1

 

Is it okay to restart this ABORTED job?

Dermot.

 

Did you delete or restart? Either way is fine.

Gudrun.

 

Updated the #FAIDSNTD_EXCLUDE_DATE with today’s date & restarted, finished okay.

Guess I should have looked at the notes before sending out the ABORT error.

Dermot.

 

 

 

Aborted Module Name:   ODSRSALX.ODSRS002_01

  Date:        Day:      Time:          Resolution:

04/13/14     Sun         12:45           Restarted by Dawn.

 

Error log and follow up comments:

 

 

+ /appworx/exec/FILESIZE ODSRSALX.ODSRS002_01.12971048.12971050.00.2014_04_13_1245.jobout 100

no output from ODSRSALX.ODSRS002_01

+ err=100

 

ERROR at line 1:

ORA-06512: at "OWBSYS.WB_RT_API_EXEC", line 759

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 41

ORA-06512: at line 319

 

Can you please attach the whole log file…that error is meaningless to me.

Mark. B.

 

 

04/13/2014 15:27    MBRITTON

The ods staging job got hung up on some resource contention which I think also caused the appman job Dawn mentioned to fail.  Gathered optimizer stats on a few tables and restarted the ODS staging load job.  New ETA for that job is 6 pm.  Still waiting to hear back from Dawn if the salx job problem is resolved.

 

Edit: The salx job finished successfully so the problem is resolved.

 

 

 

 

 

Aborted Module Name:  AREGFQTR.SSH_SFTP_01

 

  Date:        Day:      Time:          Resolution:

04/13/14     Sun         19:45           Restarted by Joleen.

06/03/14     Tue         12:21           Restarted by David.

 

 

Error log and follow up comments:

 

04/13/14.

# - sftp

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109"  nfjz3pm@in.escrip-safe.com

# > ssh: connect to host in.escrip-safe.com port 22: Connection refused

# > Connection closed

# > (255)

 

 

06/03/14.   

# - sftp

#   COMMAND        : /usr/bin/sftp  -b- -oIdentityFile="/home/jobprd/.ssh/csu_to_escrip_safe-4096-20111109"  colora-88@iwantmytranscript.com

# > Permission denied (publickey).

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

 

I just tried again from the kebler command line but same result.

 

$   sftp -oIdentityFile=/home/jobprd/.ssh/csu_to_slate-4096-20130718 csu@connect.colostate.edu@ft.technolutions.net

Permission denied (publickey).

Connection closed

$ hostname

Kebler

$ whoami

Jobprd

Gudrun.

 

Matt is going to contact the transcript folks.

David.

 

Tried calling our two tech representatives and then the general contact number but all went to voice mail which is odd for them - must be something going on there that they're all working on.  Left a voice message for one, and emailed the other two that they need to check on this and get back to us...

As soon as I hear something I'll let you know,

Matt.

 

 

 

 

 

Aborted Module Name:   AREGDYWL_SM.AREGS421_01

  Date:        Day:      Time:          Resolution:

04/13/14     Sun         23:23           Restarted by Joleen.

 

Error log and follow up comments:

 

 

ERROR at line 147:

ORA-06550: line 147, column 17:

PL/SQL: ORA-00913: too many values

ORA-06550: line 147, column 5:

PL/SQL: SQL Statement ignored

 

23:33:23 145    begin

23:33:23 146 

23:33:23 147      insert into swrwait

23:33:23 148      select c.* , sfkwlat.f_get_wl_pos(c.sfrstcr_pidm, c.sfrstcr_term_code, c.sfrstcr_crn)

23:33:23 149      from sfrstcr c

23:33:23 150      where c.sfrstcr_term_code = p_term_code

23:33:23 151      and   c.sfrstcr_crn       = p_crn

23:33:23 152      and   c.sfrstcr_ptrm_code = p_PTRM_CODE

23:33:23 153      and   c.sfrstcr_rsts_code = 'WL';

23:33:23 154 

23:33:23 155    Exception

23:33:23 156      When No_Data_Found then

23:33:23 157        null;

23:33:23 158 

23:33:23 159    end p_insert_swrwait;
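
Note on the error above: ORA-00913 means the SELECT supplies more values than the target of the INSERT has columns. The log does not say why the counts differ. One common way this happens with "insert ... select c.*" is a column being added to the source table. The sketch below reproduces that situation with SQLite; the tables "src" and "dest" and their columns are hypothetical stand-ins, not the real sfrstcr and swrwait definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins: "src" plays the role of sfrstcr, "dest" of swrwait.
conn.execute("CREATE TABLE src (pidm INTEGER, crn TEXT)")
conn.execute("CREATE TABLE dest (pidm INTEGER, crn TEXT, wl_pos INTEGER)")
conn.execute("INSERT INTO src VALUES (1, '12345')")

# Assumption for illustration: the source table later gains a column.
conn.execute("ALTER TABLE src ADD COLUMN note TEXT")

try:
    # s.* now expands to three columns, plus the computed one: four values.
    conn.execute("INSERT INTO dest SELECT s.*, 1 FROM src s")
    mismatch = None
except sqlite3.OperationalError as exc:
    mismatch = str(exc)
print(mismatch)

# Naming the columns on both sides keeps the statement stable.
conn.execute(
    "INSERT INTO dest (pidm, crn, wl_pos) SELECT s.pidm, s.crn, 1 FROM src s"
)
rows = conn.execute("SELECT * FROM dest").fetchall()
print(rows)
```

Listing the columns explicitly means a later change to the source table cannot silently alter the number of values supplied.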

 

 

 

Aborted Module Name:   HRMSL001.SEND_MAIL_02

  Date:        Day:      Time:          Resolution:

04/16/14     Wed        03:11           Restarted by Steve.

 

Error log and follow up comments:

 

 

Wed Apr 16 03:35:24 MDT 2014                                                                                                     Page 1

                                            Check Backlog for ABORTED jobs (so_status  202)                                           

Job                   Chain Id Start Date              Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

--------------------- -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

HRMSL001.SEND_MAIL_02 12988914 04-16-2014 03:11:10 MDT    202 ABORTED            20728.57                   1451                      7
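
Note on the report above: the Percentage Diff value appears to be the observed run time expressed as a percentage of the average run time. That formula is inferred from the numbers shown and is not stated by the report. A quick check, with a hypothetical function name:

```python
def percentage_diff(observed_min: float, average_min: float) -> float:
    """Observed run time as a percentage of the average (assumed formula)."""
    return round(observed_min / average_min * 100, 2)

# Values taken from the report row for HRMSL001.SEND_MAIL_02.
print(percentage_diff(1451, 7))  # 20728.57
```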

 

 

Fixed this one.  HRMSL001.HRMSS042_01.DAT was missing from /ais01/dat/work/prod.  Job uses utl_file1 for this .DAT file, so I found a copy in ais01/bkp, renamed it, copied to /ais01/dat/work/prod and restarted.

Stevie G.

 

It looks like HRMSL001 runs the SEND_MAIL_02 at the same time as the CHAIN_FINISH. Since the CHAIN_FINISH cleans up work files, the CHAIN_FINISH should probably be dependent on the SEND_MAIL_02.

David. P.

 

I agree.  I made the change.

Steve.
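
Note: the dependency change discussed above can be pictured as an ordering problem. This is a minimal sketch, not the scheduler's own configuration. The module names come from this entry, but the dependency map is a simplification, and the edge from HRMSS042_01 to SEND_MAIL_02 is an assumption based on the missing .DAT file.

```python
from graphlib import TopologicalSorter

# Each module maps to the set of modules it must wait for.
deps = {
    "SEND_MAIL_02": {"HRMSS042_01"},                  # assumed edge
    "CHAIN_FINISH": {"HRMSS042_01", "SEND_MAIL_02"},  # the new dependency
}

# static_order() yields every module after all of its predecessors.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

With the extra edge in place, CHAIN_FINISH can no longer clean up the work files while SEND_MAIL_02 still needs them.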

                               

 

 

Aborted Module Name:  ADMSAPPL.XMLSQL_01

  Date:        Day:      Time:          Resolution:

04/16/14     Wed        22:16           Continued after ABORT.

04/21/14     Mon        22:30           Continued after ABORT.

 

Error log and follow up comments:

04/16/14.

ADMSAPPL.XMLSQL_01 is CONTINUING After ABORT

Elapsed: 00:00:00.01

INSERT INTO CSUBAN.CSUS_ADMS_SALESFORCE_APP(APP_ID, SALESFORCE_ID, INTO_CENTER_NAME, APP_CREATE_DATE, APP_MODIFIED, APP_SOURCE, FIRST_NAME, LAST_NAME, ADDRESS_TYPE, CURRENT_ADDRESS_1, CURRENT_ADDRESS_2, CURRENT_CITY, CURRENT_ZIP, CURRENT_COUNTRY, GENDER, DATE_OF_BIRTH, BIRTH_CITY, BIRTH_COUNTRY, PHONE_TYPE, CITIZENSHIP_TYPE, CITIZENSHIP_COUNTRY, ACADEMIC_DISC_ACTION, CRIM_LEGAL_ACTION, ENTRY_TERM, ADMISSION_TYPE, ADMISSION_NATION, STUDENT_TYPE, RESIDENCY, STUDENT_LEVEL, CAMPUS, ATTRIBUTE_1, ATTRIBUTE_2, SORLCUR_PROGRAM, COURSE_OWNER, COURSE_CODE, AGENT_URN, COURSE_REF_NO, COURSE_NAME, PROGRAM_GROUP, DECISION_VALUE, LATEST_DECN_DATE, CREATE_DATE, CREATE_USER ) VALUES ('IN:E0336857Q', '0013000001Gd9QHAAZ', 'INTO Colorado State University', TO_DATE('04/10/2014 11:00:38','MM/DD/YY HH24:MI:SS'), TO_DATE('04/16/2014 11:35:09','MM/DD/YY HH24:MI:SS'), 'INTO-AGENT', 'Dinesh', 'Tarigopula', 'MA', 'M-5-6-7 City Light Complex, Opp Science Center', 'Near Petrol Pump, City Light', 'Surat', '395007', 'IN', 'Male', TO_DATE('06/16/1993','MM/DD/YY HH24:MI:SS'), 'Edumudi Village, Prakasam Dist', 'IN', 'MA', 'Non-Citizen', 'IN', 'No', 'No', '201490', 'IN', 'IN', 'E', 'N', 'GP', 'M', 'DN2R', 'CL51', 'N2EG-CIVX-GR', 'INTO', 'INTO', 'IN0074', 'C-188415', 'Graduate Pathway-Civil Engineering', 'Graduate Pathway', 'Provisional', TO_DATE('04/16/2014','MM/DD/YY HH24:MI:SS'), TO_DATE('04/16/14 22:15:50','MM/DD/YY HH24:MI:SS'), 'JOBPRD' )

* ERROR at line 1:

ORA-12899: value too large for column

"CSUBAN"."CSUS_ADMS_SALESFORCE_APP"."BIRTH_CITY" (actual: 30, maximum: 20)

 

04/21/14.

INSERT INTO CSUBAN.CSUS_ADMS_SALESFORCE_APP(APP_ID, SALESFORCE_ID, INTO_CENTER_NAME, APP_CREATE_DATE, APP_MODIFIED, APP_SOURCE, FIRST_NAME, LAST_NAME, MIDDLE_NAME, ADDRESS_TYPE, CURRENT_ADDRESS_1, CURRENT_ADDRESS_2, CURRENT_CITY, CURRENT_COUNTRY, GENDER, DATE_OF_BIRTH, BIRTH_CITY, BIRTH_COUNTRY, PHONE_TYPE, CITIZENSHIP_TYPE, CITIZENSHIP_COUNTRY, ACADEMIC_DISC_ACTION, CRIM_LEGAL_ACTION, ENTRY_TERM, ADMISSION_TYPE, ADMISSION_NATION, STUDENT_TYPE, RESIDENCY, STUDENT_LEVEL, CAMPUS, ATTRIBUTE_1, ATTRIBUTE_2, SORLCUR_PROGRAM, COURSE_OWNER, COURSE_CODE, AGENT_URN, COURSE_REF_NO, COURSE_NAME, PROGRAM_GROUP, DECISION_VALUE, LATEST_DECN_DATE, CREATE_DATE, CREATE_USER ) VALUES ('IN:E0337573Q', '0013000001GdAiiAAF', 'INTO Colorado State University', TO_DATE('04/15/2014 12:19:07','MM/DD/YY HH24:MI:SS'), TO_DATE('04/21/2014 15:24:14','MM/DD/YY HH24:MI:SS'), 'INTO-AGENT', 'Sri', 'Chekuri', 'Krishna Chaitanya Varma', 'MA', 'Flat No 1201, Strila Towers', 'Hydra', 'Hyderabad', 'IN', 'Male', TO_DATE('06/03/1993','MM/DD/YY HH24:MI:SS'), 'Akividu, Andhra Prade', 'IN', 'MA', 'Non-Citizen', 'IN', 'No', 'No', '201490', 'IN', 'IN', 'E', 'N', 'GP', 'M', 'DN2R', 'CL51', 'N2EG-EACX-GR', 'INTO', 'INTO', 'IN0019', 'C-189616', 'Graduate Pathway-Electrical and Computer Engineering', 'Graduate Pathway', 'Provisional', TO_DATE('04/21/2014','MM/DD/YY HH24:MI:SS'), TO_DATE('04/21/14 22:29:55','MM/DD/YY HH24:MI:SS'), 'JOBPRD' )

* ERROR at line 1:

ORA-12899: value too large for column

"CSUBAN"."CSUS_ADMS_SALESFORCE_APP"."BIRTH_CITY" (actual: 21, maximum: 20)
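
Note: both aborts above come from a BIRTH_CITY value longer than the 20 the column allows. A minimal sketch of a length check that could run before the INSERT is shown below. The function name and the limits table are hypothetical; only the BIRTH_CITY limit of 20 comes from the ORA-12899 messages. Python counts characters here, whereas the database may count bytes, so the two can differ for non-ASCII data.

```python
# Column limits; only BIRTH_CITY (20) comes from the ORA-12899 messages.
MAX_LENGTHS = {"BIRTH_CITY": 20}

def oversized_fields(row: dict, limits: dict = MAX_LENGTHS) -> dict:
    """Return {column: (actual, maximum)} for every value over its limit."""
    problems = {}
    for column, maximum in limits.items():
        value = row.get(column)
        if value is not None and len(value) > maximum:
            problems[column] = (len(value), maximum)
    return problems

# The two values that caused the aborts on 04/16/14 and 04/21/14.
print(oversized_fields({"BIRTH_CITY": "Edumudi Village, Prakasam Dist"}))
print(oversized_fields({"BIRTH_CITY": "Akividu, Andhra Prade"}))
```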

 

 

 

 

Aborted Module Name:  AREGORCH_AR.SFRNSLC_01 - AWUPGD

  Date:        Day:      Time:          Resolution:

04/21/14     Mon        11:30         See follow up below.

 

Error log and follow up comments:

 

I’m running a test for Peter on AWUPGD and I got this error for the aborted AREGORCH_AR.SFRNSLC job:

/appworxupg/out/sfrnslc_3305053.shl: sfrnslc: cannot execute. I have run this job before on AWUPGD. Any ideas?

Joleen.

 

I think we have a permissions issue. The world permissions should be r-x in /app/sct/banupgd/general/exe?

 

Empire exe% ls -l sfrn*

-rwxrwxr-x    1 banner   staff        149798 Aug 04 2013  sfrnowd*

-rwxrwx---    1 banner   staff        247712 Apr 21 08:29 sfrnslc*

-rwxrwxr-x    1 banner   staff        216658 Oct 16 2010  sfrnslc.20110712*

-rwxrwxr-x    1 banner   staff        245664 Jan 17 10:37 sfrnslc.orig*

David.

 

For the sfrnslc.pc program (and aregc001.pc) , does the compile script also set file permissions?

Peter.

 

No it doesn’t…it’s not usually a problem, they only have owner and group read/exe permissions in production but the group is different.  I will make them match.  We used to use the staff group to give developers access but we’ve since moved to just opening the world permissions up.  I will fix that right now.  Give it 5 minutes or so and try again. 

 

I changed the group to staff which the App Man unix account is a member of so it should work now.  The world privileges are just set that way to allow developers to see all the source files.  The executables are just that way because it was easier to just change the permissions recursively at the top directory.  At any rate, it should work through appman now.  Let me know if you still are having problems.

Mark.
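
Note: the listing above can be read mechanically. This is a small sketch with a hypothetical helper that reports which permission class may execute a file, given the mode string from ls -l. The file names and mode strings are copied from the listing; nothing here queries the real file system.

```python
# Position of the execute flag for each class in an "ls -l" mode string.
EXEC_INDEX = {"owner": 3, "group": 6, "other": 9}

def can_execute(mode: str, who: str) -> bool:
    """True if the given class has execute permission (x, or s/t with x)."""
    return mode[EXEC_INDEX[who]] in "xst"

listing = {
    "sfrnowd": "-rwxrwxr-x",
    "sfrnslc": "-rwxrwx---",
    "sfrnslc.20110712": "-rwxrwxr-x",
    "sfrnslc.orig": "-rwxrwxr-x",
}

# Files that an account outside the owner and group cannot run.
blocked = [name for name, mode in listing.items() if not can_execute(mode, "other")]
print(blocked)  # ['sfrnslc']
```

For sfrnslc the group class can still execute, which is consistent with the fix of putting the file in a group the App Man account belongs to.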

 

 

 

 

 

 

Aborted Module Name:  FAIDAWNT_OD.LYNX_02

  Date:        Day:      Time:          Resolution:

04/23/14     Wed        07:11         See follow up below.

 

Error log and follow up comments:

 

 

Error:

Object reference not set to an instance of an object.

 

 

It’s doing it again. It sees the AXO.doc file, it creates the AXO.doc, but there is nothing in the AXO.doc file in our directory. Please email me a copy of FAIDAWNT_OD.GLRLETR_02.13047015.AXO so I can see if there is some funk in it that is weirding out the ftp process.

 

There is nothing wrong with the file. I’m pretty sure I’ve found the problem. It was with the way I was defining a path in my first ftp from your directory. There is an AXO.pdf file in the from_faid directory. Plz delete that and let me know when you have. I’ll kick things off from my end, and it should be a wonderful life. When my part of the job finishes, I’ll let you know and you can kick the job off at the beginning of the bdms indexing part.

Mike. B.

 

I have deleted the AXO.pdf file in the from_faid directory.

Joleen.

 

 

 

 

 

 

 

 

Aborted Module Name:  HRMSVSTA.SSH_VPLUS_HIER_01

  Date:        Day:      Time:          Resolution:

04/26/14     Sat         13:12           Restarted by Joleen.

04/26/14     Sat         19:25          Aborted again, see follow up below.

 

Error log and follow up comments:

 

.LOG;   PERL5LIB=/home/jobprd/work/vplusprod.is.colostate.edu.20140426_123841               /home/jobprd/work/vplusprod.is.colostate.edu.20140426_123841/VPLUS_HIERARCHY.PL        --host=vplusprod.is.colostate.edu  --pagesec_command_file_only=N          --config_file=/home/jobprd/work/vplusprod.is.colostate.edu.20140426_123841/VPLUS_HIERARCHY.HRMS.CONFIG --file=/home/jobprd/work/vplusprod.is.colostate.edu.20140426_123841/HRMSVSTA.HRMSS168_01.13085764.BKP --config_file=/dev/null           >> /home/jobprd/work/vplusprod.is.colostate.edu.20140426_123841/HRMSVSTA.VPLUS_HIERARCHY_01.00000000.00000000.00.2014_04_26_1238.LOG 2>&1; '] : (100)

# RETURN CODE : 100

Contacted Rich. We discovered a bad report this week and I thought the failure might be related.

Joleen.

 

One of the HRMSMGMT.HRMSR001_01.FY files which was restored by one of the users had a problem this week. The compressed file that gets archived was getting errors on the uncompress.

I contacted support, which is looking into this. In the meantime I had to create a special migration rule to get that bad generation off the online reports area. I figured we would be OK since the hierarchy does not do anything on the migrated reports. Well, some user must have restored it again and then the hierarchy tried to open it to remove the page security access. It failed on that same report and I confirmed that in the output.

 

So I went through the creation of a rule and issued the migration commands to pull it off the online area.

 

This problem was not related to the one before which I fixed by removing the bad report to archive.

I looked into this error and I could see it was working on report HRMSSUMC_01.HRMSR001_01 adding the full access page security. The error indicated the permission was already there and it failed. The hierarchy process, from what I understand, should remove all the page securities and permissions and then re-add them. I could tell it did not finish by going into Vista and looking at the page securities and access permissions for the groups.

I called Elden and he said the process should start over and cleanup and readd. I looked at the log for this and it did not show any indication of removing just adding. Elden, search on Full_Access-HRMSSUMC and you will see it did not get removed!!

 

I looked at the last weekend run and I could see it removes and readds. For some reason it was not removing this access for me on restarting the job.

Rich.

 

Running the HRMS Vista Plus hierarchy builder in a manual mode.  It will take a while to run, so I'll check on it later.

 

While the VistaPlus job was running, my location experienced a brief power outage that knocked out my router.  It took a while to get it back so I could connect again. The job finished successfully, so it appears that the report indexing was damaged for the reports related to the previously mentioned page security items.  Reindexing the reports before rerunning the job may have remedied the issue.

Elden.

 

 

Aborted Module Name:   KFSXPDSA.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

04/22/14     Tue         09:12           Restarted by Dermot.

 

Error log and follow up comments:

 

2014-04-22 09:12:05,248 [RMI TCP Connection(20)-129.82.111.82] DEBUG org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl ::

sending email to null for disb # 323616

2014-04-22 09:12:05,261 [RMI TCP Connection(20)-129.82.111.82] ERROR org.kuali.rice.core.mail.MailerImpl :: sendEmail() - Error sending email.

java.lang.NullPointerException

        at javax.mail.internet.InternetAddress.parse(InternetAddress.java:673)

        at javax.mail.internet.InternetAddress.parse(InternetAddress.java:633)

        at javax.mail.internet.InternetAddress.parse(InternetAddress.java:610)

       

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

        at java.lang.Thread.run(Thread.java:662)

2014-04-22 09:12:05,267 [RMI TCP Connection(20)-129.82.111.82] ERROR org.kuali.kfs.pdp.service.impl.PdpEmailServiceImpl :: sendAchAdviceEmail() Invalid

email address. Sending message to John.Swaro@ColoState.EDU

java.lang.RuntimeException: java.lang.NullPointerException

 

Matt was able to locate the invalid email address from the disb # above (for disb # 323616 )

 

This is happening because "Teska, Valerie Anne" had a null email address.  John, are you able to correct that?

Matt.

 

Updated her profile in prod with an email address.  After we get this fixed for today, we need to run a query somehow and see how many active KFS  people do not have an email address that fed over from HR.

John .S.

       

Yeah John you’re right. Since PDP has already bundled the job your change didn’t fix the current problem.  The value that needs to change is “adv_email_addr”  in table “PDP_PMT_GRP_T”.  What is the email supposed to be?  I think we can then have a DB person sql that in for us.

I don’t think there is any way to do it through the app, yes we do need to run a query to weed out any email addresses that are set to null.

Matt.

 

Valerie.Teska@colostate.edu is what I used for the email

John .S.

 

update adv_email_addr to valerie.teska@colostate.edu

of PDP_PMT_GRP_T where disb_nbr=323616

Matt.

 

update pdp_pmt_grp_t set adv_email_addr = 'bfs_accounts_payable@mail.colostate.edu'
where upper(adv_email_addr) = upper('Valerie.Teska@colostate.edu');

Dermot.
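
Note: the thread above asks for a query that finds records with no email address. A minimal sketch of such a check follows, using SQLite in place of the production database. The table and column names (pdp_pmt_grp_t, disb_nbr, adv_email_addr) are the ones named in the thread; the rows and addresses are made up, and the real query for active KFS people would need the table that holds the HR feed, which the thread does not name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pdp_pmt_grp_t (disb_nbr INTEGER, adv_email_addr TEXT)")
conn.executemany(
    "INSERT INTO pdp_pmt_grp_t VALUES (?, ?)",
    [(1001, "someone@example.edu"), (1002, None), (1003, "")],  # made-up rows
)

# NULL never compares equal to anything, so it needs IS NULL rather than "=".
missing = [
    row[0]
    for row in conn.execute(
        "SELECT disb_nbr FROM pdp_pmt_grp_t "
        "WHERE adv_email_addr IS NULL OR TRIM(adv_email_addr) = '' "
        "ORDER BY disb_nbr"
    )
]
print(missing)  # [1002, 1003]
```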

 

 

 

 

 

Aborted Module Name:   ADMSBDMS_F3.SSH_SFTP_RC_01

  Date:        Day:      Time:          Resolution:

05/01/14     Thu         00:19          See note from Gudrun below.

 

 

Error log and follow up comments:

 

# > Write failed: Broken pipe

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

 

The file expected to be received has been received. Unless anything is wrong with the file, please apply the AFTER condition and then delete the aborted backlog component.

The one AFTER condition still to be applied to this file is:

logrunhost; chmod g+rw /ais01/ftp/from/user/ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip

So please log on to kebler and run command prior to deleting the component.

chmod g+rw /ais01/ftp/from/user/ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip

 

------  File found ---------

$ ls -ltr /ais01/ftp/from/user/ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE*.zip

-rw-rw----    1 jobprd   Gftp       23166976 May 01 02:00 /ais01/ftp/from/user/ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip

$ hostname

Kebler

$ pwd

/home/jobprd

Gudrun.

 

Aries team does not have access to verify the file. Is a restart out of the question for this job?

Joleen.

 

Received zip file has been determined to be bad. 

ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip

$ unzip ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip

Archive:  ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip

  End-of-central-directory signature not found.  Either this file is not

  a zipfile, or it constitutes one disk of a multi-part archive.  In the

  latter case the central directory and zipfile comment will be found on

  the last disk(s) of this archive.

unzip:  cannot find zipfile directory in one of ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip or

        ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip.zip, and cannot find ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip.ZIP, period.

$ ls -ltr

total 0

-rw-rw----    1 jobprd   Gjob       23166976 May 01 08:43 ADMSBDMS_F3.ADMSBDMS.CSUS_ADMS_SLATE_DOCS_20140430.zip

Gudrun.
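
Note: the file above had the expected size but could not be unzipped. This is a minimal sketch of a check that could be run on a received archive before the AFTER condition is applied; the function name is hypothetical. It uses Python's standard zipfile module, and the sample data is built in memory rather than read from the real file.

```python
import io
import zipfile

def archive_is_usable(data: bytes) -> bool:
    """True if data is a readable zip whose members all pass their CRC check."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False  # e.g. the end-of-central-directory record is missing
    with zipfile.ZipFile(buffer) as archive:
        return archive.testzip() is None  # None means no corrupt member

# Build a small valid archive, then cut it in half to mimic a broken transfer.
good = io.BytesIO()
with zipfile.ZipFile(good, "w") as archive:
    archive.writestr("doc.txt", "hello")
payload = good.getvalue()

print(archive_is_usable(payload))
print(archive_is_usable(payload[: len(payload) // 2]))
```

The first call reports a usable archive and the second does not, because the truncated copy has lost the directory stored at the end of the file.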

 

 

 

 

Aborted Module Name:  HRMSMTH2.HRMSR233_01

  Date:        Day:      Time:          Resolution:

05/01/14     Thu         21:03          No follow up received.

 

Error log and follow up comments:

 

arguments

------------

Run_Date='MAY-2014'

P_Action='F'

------------

 

The following process flow is in Aborted Status

               HRMSMTH2.HRMSR233_01

                             

 

Current NLS_LANG and NLS_NUMERIC_CHARACTERS Environment Variables are :

American_America.US7ASCII

 

''

 

Enter Password:

ORA-01427: single-row subquery returns more than one row

       ( ==> SELECT element_name

REP-0069: Internal error

REP-57054: In-process job terminated:Terminated with error:

REP-300: single-row subquery returns more than one row

       ( ==> SELECT element_name

 

 

 

Aborted Module Name:  HRMSRPTS_FW.HRMSR105S_01

  Date:        Day:      Time:          Resolution:

05/01/14     Thu         21:19          Restarted by Robin.

 

Error log and follow up comments:

 

 

arguments

------------

p_sort_by='EMP#'

------------

Execution options

VERSION=2.03b

 

The following process flow is in Aborted Status    

               HRMSRPTS_FW.HRMSR105S_01 

             

 

Current NLS_LANG and NLS_NUMERIC_CHARACTERS Environment Variables are :

American_America.US7ASCII

 

''

 

Enter Password:

REP-0069: Internal error

REP-57054: In-process job terminated:Finished successfully but output is voided

 

Please restart this job.  It should finish up just fine.

Bob V.

 

 

 

Aborted Module Name:  AREGDRGC_SP.WAIT_FOR_DARS_01

  Date:        Day:      Time:          Resolution:

05/09/14     Fri          07:50          See follow up below.

 

Error log and follow up comments:

 

 

AREGDRGC_SP.WAIT_FOR_DARS_01 has aborted. There is no output file.

All I know is when I run the query below that it is coming back with an error status. Sorry, I know that isn’t much to go on.

 

SELECT status from job_queue_list

where jobid = 'ba14050900052929';

Joleen.

 

It looks like the whole thing aborted because one of the audits failed. There is one student (829875313) that has a degree program that appears to have been mistyped or something. From Banner, the degree program that’s being pulled in is FESA-DD-BS, but there is no such degree program defined in u.achieve. There is however one named FESV-DD-BS, which I think is what it was intended to be. Could someone on the RO side fix the program for the student in Banner?

Zach.
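
Note: the mismatch described above (FESA-DD-BS pulled from Banner, FESV-DD-BS defined in u.achieve) is the kind of thing a simple closest-match check can flag. In this sketch the function name and the list of defined programs are hypothetical, apart from FESV-DD-BS, and it is not part of the audit process itself.

```python
import difflib

# Hypothetical list of defined programs; only FESV-DD-BS is from the thread.
defined_programs = ["FESV-DD-BS", "CPSC-BS", "HIST-BA"]

def suggest_program(code: str, known: list) -> list:
    """Return the closest defined code, or [] if code is valid or nothing is close."""
    if code in known:
        return []
    return difflib.get_close_matches(code, known, n=1, cutoff=0.8)

print(suggest_program("FESA-DD-BS", defined_programs))  # ['FESV-DD-BS']
```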

 

This record has been fixed. You were exactly right Zach. That program changed name about a year ago.

Hopefully it can run successfully now.

Jamie.

 

Since the audit already failed, I think the student will need to be set up again to run from the beginning. I think when it fails at this point, the whole lot of the audits that were supposed to run may even need to be reset. The batch jobid has a status of E, so even the 8 audits that were successful were never passed to the next step in the process to be formatted for printing.

In the past, when AREGDRGC aborted on WAIT_FOR_DARS because of a batch jobid status of E, is this what you’ve had to do?

Zach.

 

You are right Zach. We’ll have to delete the DARSCR and DARSGC for anyone that had it posted from the job last night. Can you send us those Ids?

Jamie.

 

I accidentally said it was a total of nine audits in my last email, but it was actually 10 total.

There was one each for 824136755, 824556286, 828944733, 828126699, 828112720, and the one that had the error, 829875313.

Then there were two each for 828303043 and 829995636, making it 10 total. Let me know if you need any additional info.

Zach.

 

The datacodes have been deleted for the students identified below.

Jamie.

 

Great. Joleen, can you stop the AREGDRGC process after WAIT_FOR_DARS where it aborted and re-run the whole thing again?

Zach.

 

I can do that. I will remove the aborted process flow and bring in a new one. I will let you know when the new one has finished running.

Joleen.

 

 

 

 

 

 

 

Aborted Module Name:  AREGDRDY.UACHIEVE_LINK_0

  Date:        Day:      Time:          Resolution:

05/13/14     Tue          20:02         Restarted  by Joleen.

 

Error log and follow up comments:

 

 

+ cd /app/uachieve/csu-transfer-bridge/bin

+ APP_CLASSPATH=../config:../bin/lib:../bin/lib/*:../usr/*:../usr:

+ rm -ef /ais01/dat/work/prod/transfer-bridge-report*.txt

+ java -Xms64m -Xmx256m -classpath ../config:../bin/lib:../bin/lib/*:../usr/*:../usr: uachieve.transferbridge.TransferBridge

Exception in thread "main" java.lang.NullPointerException

               at uachieve.transferbridge.course.extractor.impl.BannerCourseExtractor.createSourceCourse(BannerCourseExtractor.java:229)

               at uachieve.transferbridge.course.extractor.impl.BannerCourseExtractor.getSourceCourses(BannerCourseExtractor.java:157)

               at uachieve.transferbridge.course.extractor.impl.BannerCourseExtractor.loadCourses(BannerCourseExtractor.java:116)

 

 

Can you send me the output files created by the job? I’m thinking there could be something wrong with a course number for one of the students.

Zach.

 

Zach used the data in the transfer-bridge-report located in /ais01/dat/work/prod to troubleshoot the error.

I restarted the job and it has finished.

Joleen.

 

 

Aborted Module Name:   ADMSODSL.LYNX_01

  Date:        Day:      Time:          Resolution:

05/23/14     Fri          04:06          Job removed  by Joleen.

 

Error log and follow up comments:

 

====================================================================

From: ADMSODSL_BULK_COPY

Date: 05/23/2014_04:06:00  Schedule/ID: ADMSODSL/13300027

====================================================================

*** Follow-up Required -- ADMSODSL_BULK_COPY has failed       

=================================================================

Admissions:

Please reply to this email with one of the following options:

_ _ Delete the ABORTED LYNX job

       Admissions has manually reran it

or

_ _ Restart the ABORTED LYNX job

 

Looking up wsnet.colostate.edu

Making HTTPS connection to wsnet.colostate.edu

Secure 128-bit TLSv1/SSLv3 (AES128-SHA) HTTP connection

Sending HTTP request.

HTTP request sent; waiting for response.

Retrying as HTTP0 request.

Looking up wsnet.colostate.edu

Making HTTPS connection to wsnet.colostate.edu

Retrying connection.

Looking up wsnet.colostate.edu

Making HTTPS connection to wsnet.colostate.edu

Alert!: Unable to make secure connection to remote host.

lynx: Can't access startfile https://wsnet.colostate.edu/ai/appworx/bulkcopy.aspx

Joleen.

 

We don't need to re-start this web job.

Erica.

 

I have removed the aborted job.

Joleen.

 

 

 

 

Aborted Module Name:   ADMSAPPL.CONVERT_TEXT_01

  Date:        Day:      Time:          Resolution:

06/09/14     Tue         22:05           Restarted by David.

 

Error log and follow up comments:

 

 

+ cat /ais01/dat/work/prod/ADMSAPPL.CONVERT_TEXT_01_jobstat

***

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

ERROR: UTF-8 to iso-8859-1 conversion error: A file or directory in the path name does not exist. at -e line 11, <> line 1.

 

I deleted this per Kathy. Looks like the App files were empty on their end? I am allowing ADMSAPPL to proceed per Kathy.

David.

 

 

 

Aborted Module Name:  AREGDYAD.WEB_API_01

  Date:        Day:      Time:          Resolution:

06/11/14     Wed         20:04           Restarted by David.

 

Error log and follow up comments:

 

 

LWP::UserAgent::request: Simple response: Internal Server Error

# 20140611-200554 : [UAResponse]    | ----------------------------------------------------------------------------------------------------

# 20140611-200554 : [UAResponse]    | 500 Connect failed: connect: A remote host did not respond within the timeout period.; A remote host did not respond within the timeout period.

# 20140611-200554 : [UAResponse]    | Client-Date: Thu, 12 Jun 2014 02:05:54 GMT

# 20140611-200554 :                 | ----------------------------------------------------------------------------------------------------

# 20140611-200554 : *** FATAL ***   | Base_API::_exit < ScriptInterface::_exit : stack

 

Is it okay for us to re-start this ABORT?

Dermot.             

 

I re-started and it is complete.

David.

 

 

 

 

Aborted Module Name:   AREGTTRN.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

06/11/14     Wed         20:04           Restarted by David.

 

 

Error log and follow up comments:

 

 

AREGTTRN.SSH_SFTP_RC_01 / SFTP_RC_RN is stalled on AWPROD with an EMPTY FILE error, see message below:

 

# > ssh: connect to host iwantmytranscript.com port 22: Connection timed out

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

 

Is it okay for scheduling to restart this job?

Dermot.

 

I re-started this step after determining the file still existed on the transcript end, but had not been downloaded to kebler yet.

David.

 

 

 

Aborted Module Name:  ADMSODSL.LYNX_01   

  Date:        Day:      Time:          Resolution:

06/13/14     Fri          03:22           Deleted by David.

 

Error log and follow up comments:

 

 

ADMSODSL.LYNX_01 / ADMSODSL_BULK_COPY ABORTED in AWPROD.

 

Alert!: Unable to make secure connection to remote host.

 

lynx: Can't access startfile https://wsnet.colostate.edu/ai/appworx/bulkcopy.aspx

***

[101] : *** ERROR Detected in Output : File Empty ***

+ err=101

 

***

wc: 0653-755 Cannot open /appworx/out/ADMSODSL.LYNX_01.13468499.00.status.txt.

***

 

Output file is attached, please advise on how we should proceed.

 

Do you want us to just try and restart this or do you have any other insight?

Vicki.

 

Just delete the aborted job. No need to restart.

Erica.

 

 

 

 

Aborted Module Name:   ADMSBSLT.LYNX_01

  Date:        Day:      Time:          Resolution:

06/13/14     Fri          22:26           Restarted by Dermot.

07/02/14     Wed       11:32           Deleted by David.

 

Error log and follow up comments:

 

06/13/14.

Can you restart the aborted lynx job? Do we know what might have caused this?

Erica.

 

SCAN STATUS REGEX = [STATUS=.*OK]

     109 /appworx/out/LYNX_13478578.00.stdout.txt

       0 /appworx/out/ADMSBSLT.LYNX_01.13478578.00.stderr.txt

       2 /appworx/out/ADMSBSLT.LYNX_01.13478578.00.status.txt

     111 total

*** /appworx/out/ADMSBSLT.LYNX_01.13478578.00.stderr.txt ***

***

*** /appworx/out/ADMSBSLT.LYNX_01.13478578.00.status.txt ***

   URL=https://wsnet.colostate.edu/ai/appworx/BannerFeedToSlate.aspx (GET)

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100
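
Note: the scan step above can be sketched as follows. The pattern is the one shown in the log, and the codes 100 ("Status not OK") and 101 ("File Empty") are the ones that appear in these LYNX entries. The function name and the success code of 0 are assumptions, and the real wrapper script may work differently.

```python
import re

STATUS_OK = re.compile(r"STATUS=.*OK")  # pattern shown in the log

def scan_status(status_text: str) -> int:
    """Return 0 on a matching status, 101 for empty output, otherwise 100."""
    if not status_text.strip():
        return 101  # "File Empty"
    if STATUS_OK.search(status_text):
        return 0    # assumed success code
    return 100      # "Status not OK"

print(scan_status("STATUS=HTTP/1.1 500 Internal Server Error"))  # 100
print(scan_status("STATUS=HTTP/1.1 200 OK"))                     # 0
```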

 

07/02/14.

 

STATUS=HTTP/1.1 500 Internal Server Error

***

[100] : *** ERROR Detected in Output : Status not OK ***

+ err=100

 

I attached the LYNX stdout file that shows the connection error.

Erica needs to change her setup to allow for the login as well.

Gudrun.

 

Erica,

Do you know what needs to be changed on your side as well?

Vicki.

 

Yes, I'm waiting for slate to white list the server. I manually moved the file for now.

Erica.

 

Erica and Trish, not sure how long this is going to take but can we bypass this part of the program and run the rest of the schedule?  The processing unit has contacted me to see why they haven't received their graduate applications and letters.  Wanted to be able to give them an update on when that might happen.

Marcella.

 

Please abort and continue running the schedule.

Erica.

 

ADMSBSLT.LYNX has been deleted. ADMSBDMS_F3.SSH_SFTP_01 connection to technolutions was updated.

It was able to connect but did not apparently find any files.

David.

 

 

 

 

 

Aborted Module Name:  AREGHINS_FA.AGENS031_01

  Date:        Day:      Time:          Resolution:

06/13/14     Fri          18:03           Restarted by Joleen.

Error log and follow up comments:

 

ERROR at line 1:

ORA-01422: exact fetch returns more than requested number of rows

ORA-06512: at line 667

ORA-06512: at line 1833

 

18:03:46 664                        UTL_FILE.put_line ( out_report_handle, 'Acension import file error' || SQLERRM || csuid);

18:03:46 665                       error_count := error_count + 1;

18:03:46 666                    ELSE

18:03:46 667                       RAISE;

18:03:46 668                    END IF;

18:03:46 669              END get_line_block;

 

18:03:46 1831 

18:03:46 1832     --------------------------------------------------------------------------

18:03:46 1833      p_import_3rd_party_file;   -- 1. import/upload 3rd party file from acension

18:03:46 1834 

18:03:46 1835      p_update_datacode;             -- 2. update datacode with banner table data

18:03:46 1836 

18:03:46 1837      p_create_3rd_party_file (v_current_term); -- 3. create output file for 3rd party

18:03:46 1838

 

+ /appworx/exec/FILESIZE AREGHINS_FA.AGENS031_01.13481221.13481228.00.2014_06_13_1803.jobout 100

no output from AREGHINS_FA.AGENS031_01

+ err=100

 

Erin made a fix and AREGHINS_FA.AGENS031_01 is complete.

Joleen.

 

 

 

Aborted Module Name: AROSDPA1.TGRUNAP_01    

  Date:        Day:      Time:          Resolution:

 06/13/14     Fri          20:18           Restarted by Joleen.

 

Error log and follow up comments:

 

 

Starting TGRUNAP (Version 8.0.1)

 

RUN SEQUENCE NUMBER:

ORA-03113: end-of-file on communication channel

Process ID: 9634084

Session ID: 1811 Serial number: 48325

 

WRN-ORACERR: Error occurred in file "tgrunap.pc" at line 1,753

WRN-ERRSTMT: Following statement was last statement parsed:

    select tbraccd_pidm ,tbraccd_term_code ,null   from tbbdetc c ,tbraccd

tgrunap terminated with error

0 lines written to /appworx/out/tgrunap_3455797.lis

 

 

Please restart this job.  This is a Banner process that deals with the application of payments.

Steven Dove.

 

 

Aborted Module Name: ODSRKFSX.ODSRS002_01  

  Date:        Day:      Time:          Resolution:

06/16/14     Mon        23:38           Restarted by James.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-20000: ERROR running LOAD_CSUF_ACCOUNT_MONTHLY_SNAP

ORA-06512: at "CSUADMIN.CSUG_RUN_OWB_TASK", line 60

ORA-06512: at line 234

 

60               dbms_mview.refresh('CSUBAN.CSUS_INTERDISCIPLIN_PROGRAM_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

61               dbms_mview.refresh('CSUBAN.CSUG_EADM_EMAIL_ADDRESS_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

 

233              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_ORG_MONTHLY_SNAP_T');

234              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_CSUF_ACCOUNT_MONTHLY_SNAP');

235              csug_run_owb_task('OWBREP', 'ODS_CSUKFS_LOCATION', 'PLSQL', 'LOAD_GL_BALANCE_SUMMARY_ADJ_T');

 

 

Here is the error on the DB:

ORA-12899: value too large for column "CSUKFS"."CSUF_ACCOUNT_MONTHLY_SNAP_T"."DIVISION_CD" (actual: 4, maximum: 2)

Mark P.

 

There are several potential fields in KFS where that value could come from.

Can you tell me what table and column the mapping is looking at?

Josh.

 

I am looking into the mapping now and will let you know where it is coming from.

Kathy G.

 

 

 

 

Aborted Module Name:  FAIDDYNT_OD.RORBPST_01

  Date:        Day:      Time:          Resolution:

06/17/14     Tue         21:30           Restarted by Joleen.

 

Error log and follow up comments:

 

ORA-02291: integrity constraint (GENERAL.FK1_GURMAIL_INV_GTVLETR_CODE) violated - parent key not found

 

WRN-ORACERR: Error occurred in file "rorbpst.pc" at line 7,089

WRN-ERRSTMT: Following statement was last statement parsed:

    insert into GURMAIL (GURMAIL_AIDY_CODE,GURMAIL_PIDM,GURMAIL_LETR_CODE,

rorbpst terminated with error

419 lines written to /appworx/out/rorbpst_3458460.lis

 

 

I will have to take a look when I get in - I can't see what's wrong on my phone :) It's probably something I didn't do right.

Candy.

 

Candy fixed the problem and had me restart. FAIDDYNT_DAILY_NOTIFICATION has finished running.

Joleen.

 

 

Aborted Module Name:   FAIDAM62_FAIDPROF_SCHEDS_DONE

  Date:        Day:      Time:          Resolution:

06/18/14     Wed        06:06          Long running Job.

 

Error log and follow up comments:

 

 

FAIDPROF send mail had aborted with this error:

FATAL : File (/ais01/dat/work/prod/FAIDPROF_OD.RNEIN15_02.TXT) NOT FOUND

 

An after condition on FAIDPROF_OD.RNEIN15_02 is supposed to copy the file to work dat. That condition didn't work because I didn't see the file in work dat. I don't have access to the appworx out directory so I ran COPY_JOBLOG to copy the file over to work dat. I restarted the job after that and it finished. I didn't see any mention in the prefix script in the send mail output that the after condition failed. Hopefully it was just a glitch.
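
The after condition described above failed silently. As a minimal sketch only, a copy step could verify its own result and report loudly when the source is missing. The function name copy_and_verify and its return codes are hypothetical and are not part of the existing after condition or of COPY_JOBLOG.

```shell
# Hypothetical helper: copy a job output file into the work directory and
# fail with a clear message instead of leaving the next step to discover it.
copy_and_verify() {
    src=$1     # file produced by the job (name is site specific)
    dest=$2    # where the downstream step expects to find it

    if [ ! -f "$src" ]; then
        echo "FATAL : source file ($src) NOT FOUND" >&2
        return 1
    fi

    cp "$src" "$dest" || return 2

    # cmp -s is silent and returns 0 only when both files are identical
    if ! cmp -s "$src" "$dest"; then
        echo "FATAL : copy ($dest) does not match source" >&2
        return 3
    fi

    echo "copied $src to $dest"
}

# Example call (placeholder source path, assumed):
# copy_and_verify <job output file> /ais01/dat/work/prod/FAIDPROF_OD.RNEIN15_02.TXT
```

A non-zero return here would let the scheduler flag the copy step itself rather than the later send mail step.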

 

 

-----Original Message-----

From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]

Sent: Wednesday, June 18, 2014 6:06 AM

To: 9705815577@tmomail.net

Cc: IS DL: Alert APMX

Subject: AWPROD APMXCHKS Long-Running Flow Backlog Warning

 

====================================================================

From: APMXCHKS.APMXCHK_LONG_FLOW

Date: 18-Jun-2014_06:06:03  Schedule/ID: 13509560                         

====================================================================

 

Wed Jun 18 06:05:16 MDT 2014                                                                             Page 1

                      Check backlog for long-running process flows (>300% but min 30min)                       

___________Process_Flow____________    ChainId  Elpsd Sec  Elpsd Min  Avg Elpsd Secs  Avg Elapsd Min Percentage

----------------------------------- ---------- ---------- ---------- --------------- --------------- ----------

              FAIDPROF_PROFILE_LOAD   13505713       2644         44             707              12     374.08

      FAIDAM62_FAIDPROF_SCHEDS_DONE   13505783       2642         44             848              14     311.61

 

 

 

Aborted Module Name:  FAIDPROF_PROFILE_LOAD

  Date:        Day:      Time:          Resolution:

06/18/14     Wed        06:06          Long running Job.

 

Error log and follow up comments:

 

 

FAIDPROF send mail had aborted with this error:

FATAL : File (/ais01/dat/work/prod/FAIDPROF_OD.RNEIN15_02.TXT) NOT FOUND

 

An after condition on FAIDPROF_OD.RNEIN15_02 is supposed to copy the file to work dat. That condition didn't work because I didn't see the file in work dat. I don't have access to the appworx out directory so I ran COPY_JOBLOG to copy the file over to work dat. I restarted the job after that and it finished. I didn't see any mention in the prefix script in the send mail output that the after condition failed. Hopefully it was just a glitch.

 

 

-----Original Message-----

From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]

Sent: Wednesday, June 18, 2014 6:06 AM

To: 9705815577@tmomail.net

Cc: IS DL: Alert APMX

Subject: AWPROD APMXCHKS Long-Running Flow Backlog Warning

 

====================================================================

From: APMXCHKS.APMXCHK_LONG_FLOW

Date: 18-Jun-2014_06:06:03  Schedule/ID: 13509560                          

====================================================================

 

Wed Jun 18 06:05:16 MDT 2014                                                                             Page 1

                      Check backlog for long-running process flows (>300% but min 30min)                      

___________Process_Flow____________    ChainId  Elpsd Sec  Elpsd Min  Avg Elpsd Secs  Avg Elapsd Min Percentage

----------------------------------- ---------- ---------- ---------- --------------- --------------- ----------

              FAIDPROF_PROFILE_LOAD   13505713       2644         44             707              12     374.08

      FAIDAM62_FAIDPROF_SCHEDS_DONE   13505783       2642         44             848              14     311.61

 

 

 

Aborted Module Name:   ADMSAPPL.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

07/02/14     Wed        06:10          Restarted by David.

 

Error log and follow up comments:

# > ssh: connect to host sftp.technolutions.net port 22: Connection timed out

# > Connection closed

# > (255)

#==============================================================================

# FATAL : Command failed with code : 255

#------------------------------------------------------------------------------

# RETURN CODE = 100

I noticed the abort last night. I tried restarting and it failed with the timed out message. I tried restarting this morning and it failed again. Is there anything we can do or maybe I should ask ADMS to contact technolutions?

Joleen.

 

Trish and Erica,

Any word on why we cannot connect to Technolutions / Slate?

Vicki.

 

We've submitted a service request to Technolutions.  We'll let you know what we find out.

Trish.

 

No change in connectivity to that server.

Since both Joleen and David are out today I just tried again from the command line as jobprd. Pinging server sftp.technolutions.net works but sftp connection using SSH authentication fails. Their sftp service may be down.

$ sudo su - jobprd

$ /usr/bin/sftp  -oIdentityFile="  <actual ssh id file> "  csu@sftp.technolutions.net         <

ssh: connect to host sftp.technolutions.net port 22: Connection timed out

Connection closed

$

$ ^X

$ hostname

Kebler

$ whoami

jobprd

$ ping sftp.technolutions.net

PING cluster-nlb.technolutions.net: (72.9.129.38): 56 data bytes

64 bytes from 72.9.129.38: icmp_seq=0 ttl=115 time=52 ms

64 bytes from 72.9.129.38: icmp_seq=1 ttl=115 time=56 ms

64 bytes from 72.9.129.38: icmp_seq=2 ttl=115 time=50 ms

Gudrun.

 

Technolutions has retired their legacy FTP servers.  We have information on our new requirements that I will get you when I get back to my desk.

Trish.

 

I worked with Elden and the new sftp credentials are available and I thought in use by IS.

Please let me know if we can assist on your end. We are using a SSH2 public key.

Protocol      SFTP

Host   ft.technolutions.net

Port   22

Username      csu@connect.colostate.edu

Command Line  sftp csu=connect.colostate.edu@ft.technolutions.net

Erica.

 

ADMSAPPL.SSH_SFTP_01 abort has been resolved.

Permanent change to new user has been made as well for process flow ADMSAPPL SSH_SFTP components.

Gudrun.

 

 

Aborted Module Name:  KFSXGLBF_PE.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

07/02/14     Wed        06:10          Deleted, to be re-run after build.

 

Error log and follow up comments:

 

 

Caused by: org.apache.ojb.broker.PersistenceBrokerException: org.apache.ojb.broker.OJBRuntimeException: Incorrect or not found field reference name 'ac countNumber' in descriptor org.apache.ojb.broker.metadata.CollectionDescriptor@5dc07cfa[cascade_retrieve=true,cascade_store=object,cascade_delete=object,is_lazy=true,class_of_Items=class org.kuali.kfs.coa.businessobject.PriorYearIndirectCostRecoveryAccount] for class-descriptor 'org.kuali.kfs.coa.businessobject.PriorYearIndirectCostRecoveryAccount'

        at org.apache.ojb.broker.core.proxy.AbstractIndirectionHandler.materializeSubject(Unknown Source)

        at org.apache.ojb.broker.core.proxy.AbstractIndirectionHandler.getRealSubject(Unknown Source)

        ... 65 more

Caused by: org.apache.ojb.broker.OJBRuntimeException: Incorrect or not found field reference name 'ac countNumber' in descriptor org.apache.ojb.broker.metadata.CollectionDescriptor@5dc07cfa[cascade_retrieve=true,cascade_store=object,cascade_delete=object,is_lazy=true,class_of_Items=class org.kuali.kfs.coa.businessobject.PriorYearIndirectCostRecoveryAccount] for class-descriptor 'org.kuali.kfs.coa.businessobject.PriorYearIndirectCostRecoveryAccount'

        at org.apache.ojb.broker.metadata.ObjectReferenceDescriptor.getForeignKeyFieldDescriptors(Unknown Source)

 

It looks like a typo in the ojb-coa.xml file. Take a look at this XML excerpt.

 

        <collection-descriptor name="indirectCostRecoveryAccounts"

                               element-class-ref="org.kuali.kfs.coa.businessobject.PriorYearIndirectCostRecoveryAccount"

                               collection-class="org.apache.ojb.broker.util.collections.ManageableArrayList"

                               auto-retrieve="true" auto-update="object" auto-delete="object" proxy="true">

            <orderby name="priorYearIndirectCostRecoveryAccountGeneratedIdentifier" sort="ASC"/>

            <inverse-foreignkey field-ref="chartOfAccountsCode"/>

            <inverse-foreignkey field-ref="ac countNumber"/>

        </collection-descriptor>

    </class-descriptor>

 

There’s a space in the last string there, “ac countNumber”. This looks to be new in 5.0.3, so that would explain why it’s never happened before. It’s an easy fix, but I think we’ll have to rebuild KFS prod to make it work. Could we test it somewhere first, or is it difficult to get things set up with the end of year data in a test environment?

-Zach.
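
A minimal sketch of a check that could catch this kind of typo before a build: it lists any field-ref attribute whose value contains whitespace. The function name find_bad_field_refs is hypothetical, and the check assumes each attribute sits on a single line, as in the excerpt above.

```shell
# Hypothetical pre-build check for OJB repository XML files.
# Prints "line-number:line" for every field-ref value that contains
# whitespace, and returns non-zero when nothing suspicious is found.
find_bad_field_refs() {
    # $1 = path to the XML file to scan
    grep -nE 'field-ref="[^"]*[[:space:]][^"]*"' "$1"
}

# Example call (file location is an assumption):
# find_bad_field_refs ojb-coa.xml
```

Because grep returns 1 when there is no match, a wrapper script would treat a return of 0 as "typo found" and stop the build.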

 

 

 

Aborted Module Name:   ADMSBDMS_F3.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

07/02/14     Wed        11:43          Restarted by David.

 

Error log and follow up comments:

 

 

# FATAL : Command failed with code : 100

#------------------------------------------------------------------------------

# RETURN CODE = 100

#==============================================================================

  Child: Job return = 100

 2 11:44:59-  Child: put to memory:[100]

 

 2 11:44:59-  Child: In memory:[100]

 

  Child:Done.

 2 11:44:59-Child:Done

 2 11:45:04-Parent: (8)Checking child process(6422724)

 2 11:45:04-Parent: Child process[6422724] done.

 2 11:45:04-Parent: Checking child mem

 2 11:45:04-Parent: Value in mem [100]

 2 11:45:04-Parent: Child process returned a value.

 2 11:45:04-Parent: child process done.

 2 11:45:04-Parent:Value in mem [100]

 2 11:45:04-Deleting kill file if exists [/appworx/run/jobpid.13632051.00]

 2 11:45:04-Deleting flag file if exists [/appworx/run/jobpid.13632051.00]

 2 11:45:04-Getting env 'SURUNEXIT'

 

Trish and Erica

What have we been doing for bad BDMS documents from Slate?  Joleen and Kathy are not here right now.

Vicki.

 

Peter sends them to me.

Erica.

 

Prior to that I fixed ADMSBDMS_F3.SSH_SFTP_01. Same fix as before. Applied permanent change as well. OK, the unexpected change has hopefully worked itself through the process flows. We should be better off tomorrow.

Gudrun.

 

I have also modified ADMSBDMS_F3, ADMSSXML, and the prompt values for the SFTP-RC-DL-LOOP components in these chains and ADMSAPPL. Hopefully we have them all changed.

(I ran an Appman report for prompts with the 'technolution' text in them.)

David.

 

 

 

 

Aborted Module Name:  AROSDTRN.CHAIN_FINISH_01

  Date:        Day:      Time:          Resolution:

07/07/14     Mon        20:16          Restarted by Joleen.

 

Error log and follow up comments:

 

Maybe some follow-up is needed for this abort?

Since the file was empty, I restarted the CHAIN_FINISH so the AROS schedule could continue.

(I checked the file in backup and it was empty, also)

 

Abort Error:

*** SEARCH OF JOBLOG FOR ERROR STRINGS FOUND THE FOLLOWING:

***

cat: 0652-050 Cannot open /userfiles/Uhous/data/AROSDTRN.AROSP001_01.HOUS.TRANS.

***

*** END SEARCH OF JOBLOG FOR ERROR STRINGS

 

It is strange that the collect file message said the file was not found and for the Useedlab it said the file was empty.

 

 

=================================================================

07/07/14 19:49:52

Source File Path:  /userfiles/Uhous/data

       File Name:  AROSDTRN.AROSP001_01.HOUS.TRANS

****  W A R N I N G:  FILE NOT FOUND  ****

=================================================================

 

=================================================================

07/07/14 19:49:52

Source File Path:  /userfiles/Useedlab/data

       File Name:  AROSDTRN.AROSP001_01.SEEDLAB.TRANS

****  FILE IS EMPTY  ****

=================================================================

Joleen.

 

I think this means that the file did not exist. Normally, there should be an empty file even if there is no data.

It is normal for there to be no data sometimes, but I am not sure why the empty file was not there.

David.
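
The difference between the two collect messages above (FILE NOT FOUND versus FILE IS EMPTY) can be sketched with two small shell tests. The function names file_state and ensure_exists are hypothetical. Whether the sending side should create the empty placeholder itself is an assumption based on David's note.

```shell
# Report which of the three states a transaction file is in.
file_state() {
    if [ ! -e "$1" ]; then
        echo "NOT FOUND"      # the file does not exist at all
    elif [ ! -s "$1" ]; then
        echo "EMPTY"          # the file exists with size zero
    else
        echo "HAS DATA"
    fi
}

# Create an empty placeholder only when the file is missing, so a
# "no data today" run still leaves a file for the collect step.
ensure_exists() {
    [ -e "$1" ] || : > "$1"
}

# Example call (path taken from the abort above):
# file_state /userfiles/Uhous/data/AROSDTRN.AROSP001_01.HOUS.TRANS
```

ensure_exists never truncates a file that already has data, because the redirection only runs when the existence test fails.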

 

 

Aborted Module Name: ADMSSRLD_DY.SURLOAD_02   

  Date:        Day:      Time:          Resolution:

07/08/14     Tue        22:35          Restarted by Joleen.

 

Error log and follow up comments:

 

 

ORA-02291: integrity constraint (GENERAL.FK1_GLBEXTR_INV_GLBSLCT_KEY) violated - parent key not found

WRN-ORACERR: Error occurred in file "surload.pc" at line 705

WRN-ERRSTMT: Following statement was last statement parsed:

insert into GLBEXTR (glbextr_key,glbextr_application,glbextr_selection

surload terminated with error

11 lines written to /appworx/out/surload_3479652.lis

 

08-JUL-2014 22:35:31                                 Colorado State University                                            PAGE 1

999999                                                  Communication Load                                                8.2

                                                       AUTOMATED GURMAIL LOAD

                                                            Load Errors

PIDM      Id                            Name                     System Indicator     Description/Comments

010998112 827332500 Sorel , Keelie C                             S                    Not Loaded; Duplicate Non-printed letter

 

 

Kathy asked me to restart. ADMSSRLD_LOAD_LETTERS has finished running.

Joleen.

 

 

 

Aborted Module Name:   KFSXGLCL_D1.SEND_MAIL_02 (SEE PAGE 476 ALSO)

  Date:        Day:      Time:          Resolution:

07/11/14     Fri          19:25          Resubmitted by Gudrun.

 

Error log and follow up comments:

 

 

# reply_to=""

# to="/ais01/dat/misc/mailst/SEND_MAIL.BFS_KUALI_IMPLEMENTATION.LST"

# cc="/ais01/dat/misc/mailst/SEND_MAIL.KFSX.ALERT.LST"

# bcc="kfsprd KFSXGLCL.Uadms.sla_app_07102014.xml FILE REJECTED"

# subject=""

# --expand_aw_ms_options="/ais01/dat/misc/mailst/SEND_MAIL_TEMPLATE.KFSX_PRIOR_FY_REJECT.TXT"

#   --> --options=" ERROR -999 ORA-01722: invalid number"

#   --> --options=""

# > (3)

 

The D1 completed with no history of the ABORTED step, but the above aborted step remains in backlog!

Dermot.

 

Looks like validation for one of your collected files also failed.  Checking on the cause of the missing subject.

 

 

Something in the COLLECT_FILES job is amiss for this particular case of sending a validation failure email.

COLLECT_FILES drives submission of the KFSX_JAVA job used to validate files. If a file is bad, an email is sent; in this case, an email alerting that a prior fiscal year is being used. I believe a change needs to be made to the args passed in COLLECT_FILES for this particular aborted SEND_MAIL awrun command.  It may not have happened before, so it did not get caught.

Value /ais01/dat/misc/mailst/SEND_MAIL_TEMPLATE.KFSX_PRIOR_FY_REJECT.TXT should go to the next prompt line.

I will manually recreate the SEND_MAIL job, submit it, and delete the aborted SEND_MAIL_R job that is now on hold. I can’t fix the Options value in backlog for some reason.

 

 

 

Aborted Module Name: KFSXGLCL_D1.SEND_MAIL_02 (CONTINUED) 

  Date:        Day:      Time:          Resolution:

07/11/14     Fri          19:25          Resubmitted by Gudrun.

 

Error log and follow up comments:

 

 

Here is the code from COLLECT_FILES that created the aborted SEND_MAIL job with _R alias.

print "****  R E J E C T E D - Contains Prior Fiscal Year" \

            >> ${collect_driver_done_summary}

     print "****               and  Non-Allowable Origin Code" \

            >> ${collect_driver_done_summary}

     print "****  Bad File Path:  ${collect_source_path}" \

          >> ${collect_driver_done_summary}

     print "****      File Name:  ${collect_source_file_no_path}.bad_fy" \

          >> ${collect_driver_done_summary}

     awrun SEND_MAIL \

           -v ${this_chain_name}.SEND_MAIL_${iteration_cnt}_R \

           -arg jobprd@mailer.is.colostate.edu \

           _NULL_ \

           ${mailst}/SEND_MAIL.BFS_KUALI_IMPLEMENTATION.LST \

           ${this_email_cc} \

           ${mailst}/SEND_MAIL.KFSX.ALERT.LST \

           "${ORACLE_SID} ${collect_source_file_no_path} FILE REJECTED" \

           _NULL_ \

           ${mailst}/SEND_MAIL_TEMPLATE.KFSX_PRIOR_FY_REJECT.TXT \

           ${collect_source_path} \

           ${collect_source_file_no_path}.bad_fy

 

Compare this to our regular email for an ordinary file validation failure  _V

   # so validate error can be emailed.

       if (( ${collect_rec_cnt} > 1 )) && \

          (grep 'emailAddress' ${this_xml_fn} > /dev/null)

       then

           this_email=$(grep 'emailAddress' ${this_xml_fn} | cut -f 2 -d '>' | cut -f 1 -d '<')

           awrun SEND_MAIL \

           -v ${this_chain_name}.SEND_MAIL_${iteration_cnt}_V \

           -arg jobprd@mailer.is.colostate.edu \

           _NULL_ \

           ${this_email} \

           ${mailst}/SEND_MAIL.KFSX.ALERT.LST \

           _NULL_ \

           "${ORACLE_SID} ${collect_source_file_no_path} FILE VALIDATION FAILED" \

           _NULL_ \

           ${mailst}/SEND_MAIL_TEMPLATE.KFSX_VALIDATE.TXT \

           ${collect_source_path} \

           ${collect_source_file_no_path}.bad \

           ${collect_source_file_no_path}.bad_msg \

 

Received call from Dermot Barrett that AppMan job KFSXGLCL_D1.SEND_MAIL_02_R aborted with no subject.

Resolved abort by manually submitting an email job with proper prompt values. Deleted aborted original email job. Turning in ticket for COLLECT_FILES script to be changed on Monday.
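
The aborted job shows every value one prompt earlier than intended (the subject text landed in bcc and the subject was blank). One way a positional list can shift like that in ksh or sh is an unquoted variable that expands to nothing. The sketch below only demonstrates that shell behavior. That ${this_email_cc} was empty in the failing run is an assumption, and show_args and the sample values are hypothetical.

```shell
# Print each positional argument with its slot number.
show_args() {
    i=1
    for arg in "$@"; do
        echo "$i=$arg"
        i=$((i + 1))
    done
}

this_email_cc=""    # assumption: no cc address was set for this file

# Unquoted: the empty value disappears and later values move up one slot.
show_args sender _NULL_ to.lst $this_email_cc alert.lst "FILE REJECTED"

# Quoted with a default: the slot is always filled, so nothing shifts.
show_args sender _NULL_ to.lst "${this_email_cc:-_NULL_}" alert.lst "FILE REJECTED"
```

In the first call alert.lst lands in slot 4 and the subject text in slot 5. In the second call slot 4 holds _NULL_ and the remaining values stay where the prompts expect them.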

 

 

 

Aborted Module Name: ODSRAGEN.ODSRS002_01 / ODSRAGEN_REFRESH_AGEN_ODSPROD    

  Date:        Day:      Time:          Resolution:

06/28/14     Sat         2:47           Modify the ODSRS002.sql file and put it in the temp.

 

Error log and follow up comments:

 

ODSRAGEN.ODSRS002_01 / ODSRAGEN_REFRESH_AGEN_ODSPROD is in ABORT status on AWPROD.

 

ORA-12008: error in materialized view refresh path

ORA-01427: single-row subquery returns more than one row

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2563

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2776

ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2745

ORA-06512: at line 64

 

60               dbms_mview.refresh('CSUBAN.CSUS_INTERDISCIPLIN_PROGRAM_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

61               dbms_mview.refresh('CSUBAN.CSUG_EADM_EMAIL_ADDRESS_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

62               dbms_mview.refresh('CSUBAN.CSUS_STUDENT_FEES_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

63               csug_run_owb_task('OWBREP', 'ODS_CSUBAN_LOCATION', 'PLSQL', 'LOAD_CSUS_MAJOR_CCHE_CIPC');

64               dbms_mview.refresh('CSUBAN.CSUS_STUDIO_ABROAD_MV','C','',TRUE, FALSE, 0,0,0,FALSE);

65               csug_run_owb_task('OWBREP', 'ODS_CSUBAN_LOCATION', 'PLSQL', 'LOAD_CSUS_PROGRAM_INFO');

66               csug_run_owb_task('OWBREP', 'ODS_CSUBAN_LOCATION', 'PLSQL', 'LOAD_CSUS_PROGRAM_INFO_INDEPT');

 

There is a data problem with the CSUBAN.CSUS_STUDIO_ABROAD_MV. 

Student (person_uid = '11247125') has a duplicate primary advisor in the ADVISOR view.

I am testing a solution right now and if it works I will modify the ODSRS002.sql file and put it in the temp directory so we only have to run the remaining MVs and mappings.

Update coming soon…

Mark P.

OK, the fix worked and I have copied a modified version of ODSRS002.sql into /ais01/src/sql/temp.  The ODSRS002.sql will need to be deleted after the ODS refresh completes today.

Please start the REFRESH_STUDENT_GENERAL_CSU appman process.

Mark P.

 

 

 

 

Aborted Module Name:   KFSXBCFO.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

07/18/14     Fri          09:09          Resubmitted by Joleen.

 

 

Error log and follow up comments:

 

 

2014-07-18 09:07:45,475 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: javax.xml.ws.soap.SOAPFaultException: OJB operation failed; nested exception is org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: org.kuali.rice.coreservice.impl.parameter.ParameterBo@3dace634[namespaceCode=KFS-BC,componentCode=BudgetConstruction,name=RUN_CENTRALIZED_FRINGE_BATCH_JOB,applicationId=KUALI,value=N,description=Determines whether or not centralized fringe benefits should be calculated.,parameterTypeCode=VALID,evaluationOperatorCode=A,versionNumber=31,objectId=024C69BD-C258-186C-169A-05EED2012312,newCollectionRecord=false]

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

 

 

Doesn’t want to allow a change to RUN_CENTRALIZED_FRINGE_BATCH_JOB parm. 

Mike.

 

 

Aborted Module Name: KFSXBCFO.KFSX_KFSXS048_01

  

  Date:        Day:      Time:          Resolution:

07/18/14     Fri          10:27          Restarted by Dermot.

 

 

Error log and follow up comments:

 

 

SP2-0135: symbol subfunds is UNDEFINED

SQL> select                 '(''' || replace(txt, ';' ,''',''') || ''')'

  2  as subfunds

  3  from krns_parm_t

  4  where parm_dtl_typ_cd = 'BudgetConstruction'

  5  and parm_nm = 'CENTRALIZED_FRINGE_SUBFUNDS';

from krns_parm_t

     *

ERROR at line 3:

ORA-00942: table or view does not exist

 

 

 

 

Aborted Module Name:   KFSXCMYE_CAM_YEAR_END_DEPREC

  Date:        Day:      Time:          Resolution:

 

Error log and follow up comments:

 

 

 

 


 

 

Aborted Module Name:   FAIDPROF_OD.FAIDIDSM_01

  Date:        Day:      Time:          Resolution:

07/22/14     Tue         05:33          Restarted by Joleen.

 

Error log and follow up comments:

 

 

Exception in thread "main" org.collegeboard.ids.api.FileServiceException: Failed to get File Set for /1415/4075/data_delivery: An unexpected exception was thrown

       at org.collegeboard.ids.internal.web.WebFileService.listFiles(WebFileService.java:221)

       at org.collegeboard.ids.client.FileServiceClient.listFiles(FileServiceClient.java:139)

       at org.collegeboard.ids.client.FileServiceClient.listFiles(FileServiceClient.java:124)

       at org.collegeboard.ids.client.mailbox.MailboxClient.execute(Unknown Source)

       at org.collegeboard.ids.client.mailbox.Mailbox.main(Unknown Source)

Caused by: org.collegeboard.ids.api.FileConnectionException: Failed to obtain an input stream from File /1415/4075/data_delivery: An I/O exception was thrown

       at org.collegeboard.ids.internal.web.WebFileService.getInputStream(WebFileService.java:378)

       at org.collegeboard.ids.internal.web.WebFileService.listFiles(WebFileService.java:203)

       ... 4 more

Caused by: java.net.SocketException: Connection reset

 

Please contact third party for information on any downtime on their end. A reset of the same component resulted in the same error. Initial error thrown around 5:33am this morning.

 

22 07:10:39-No Kill File found('/appworx/run/kill.13779958.01').

22 07:10:39-Parent: sleeping for 10 seconds.

Gudrun.

 

 

Data Delivery will be delayed this morning.  We are working to fix this issue and will let you know as soon as it is available. We apologize for any inconvenience.

Financial Aid Services

 

 

Candy got word that Data Delivery was working again. I restarted FAIDPROF_OD.FAIDIDSM and it finished.

Joleen.

 

 

Aborted Module Name: KFSXGLNA_FE.KFSXS059_01  

  Date:        Day:      Time:          Resolution:

07/22/14     Tue         20:48          Restarted by Dermot.

 

Error log and follow up comments:

 

 

old   3: utlpath                varchar2(255) := '&utl_path';

new   3: utlpath             varchar2(255) := '/orautl/kfsprd';

old   4: outfile1              varchar2(80)  := '&utl_file1';

new   4: outfile1            varchar2(80)  := 'KFSXGLNA_FE.KFSXS059_01.utl_file1';

old   5: outfile2              varchar2(80)  := '&utl_file2';

new   5: outfile2            varchar2(80)  := 'KFSXGLNA_FE.KFSXS059_01.utl_file2';

   from KRNS_PARM_T

       

ERROR at line 102:

ORA-06550: line 102, column 9:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 100, column 4:

PL/SQL: SQL Statement ignored

ORA-06550: line 109, column 10:

PL/SQL: ORA-00942: table or view does not exist

ORA-06550: line 107, column 5:

PL/SQL: SQL Statement ignored

ORA-06550: line 114, column 10:

PL/SQL: ORA-00942: table or view does not exist

 

 

 

Aborted Module Name:  HRMSS260.CHAIN_FINISH_01

  Date:        Day:      Time:          Resolution:

07/16/14     Wed         03:31         See follow up below from Gudrun.

 

Error log and follow up comments:

 

 

Just noticed …we can talk about this during Friday’s training forum time if need be…bring your suggestions. Here are mine:

 

 

A chain finish component aborted because the complete filespec of the file to be backed up was not specified in the first prompt; only the letter Y was entered.

 

Unless there are other suggestions, there are two options for fixing this.

 

Either

    

- For a custom approach

a)        Enter the complete filespec in the first prompt of CHAIN_FINISH. Beware: an entry here alone will not delete the file that got backed up.

 

- For the standard approach

Populate the recognized CHAIN* backup and delete drivers with the file or files to be backed up and deleted. The CHAIN_FINISH script code detects these files.

 

b)        Create file {#workdat}/CHAIN_{chain_id}.CHAIN_FINISH_01.FILES_TO_BACKUP_DRIVER.DAT

c)        Create file {#workdat}/CHAIN_{chain_id}.CHAIN_FINISH_01.FILES_TO_DELETE_DRIVER.DAT

 

Examples of After conditions added:

 

{#logrunhost}; ls {#userfiles_Xinto}/ApplicationLoad/Input/CSU_Application_*  >  {#workdat}/{#1}.COLLECT_FILES.DAT;  cat {#workdat}/{#1}.COLLECT_FILES.DAT  >> {#files_to_delete_driver}; cat {#workdat}/{#1}.COLLECT_FILES.DAT  >> {#files_to_backup_driver}

Gudrun.
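
The standard approach can be sketched as a small helper that appends each collected file to both driver files. This is a hedged illustration only: register_files is a hypothetical name, and the one-path-per-line driver layout is an assumption inferred from the ls output used in the after condition above.

```shell
# Hypothetical helper: add every existing file to the backup driver and
# to the delete driver, one path per line.
register_files() {
    backup_driver=$1
    delete_driver=$2
    shift 2                       # remaining arguments are the files

    for f in "$@"; do
        # An unmatched glob is passed through literally; skip it
        [ -e "$f" ] || continue
        echo "$f" >> "$backup_driver"
        echo "$f" >> "$delete_driver"
    done
}

# Example call (driver names follow the pattern in b) and c) above;
# the input directory is a placeholder):
# register_files CHAIN_<chain_id>.CHAIN_FINISH_01.FILES_TO_BACKUP_DRIVER.DAT \
#                CHAIN_<chain_id>.CHAIN_FINISH_01.FILES_TO_DELETE_DRIVER.DAT \
#                <input dir>/CSU_Application_*
```

Appending with >> keeps entries that earlier components in the chain may already have written to the drivers.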

 

 

Aborted Module Name:  KFSXBCFO.KFSX_JAVA_01

  Date:        Day:      Time:          Resolution:

07/18/14     Fri          08:51           See follow up below.

 

Error log and follow up comments:

 

 

It happened again Guys!

 

2014-07-18 09:07:45,472 [main] INFO  org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient ::

               *******************************************************

               Started processing step centralizedFringeBatchStep of job KFSXBCFO.centralizedFringeBatchStep.13757803.13757806.01 for user kr

               Executing step: centralizedFringeBatchStep

               #### Log file name for this job step : /ais02/app/kfs/prd/logs/KFSXBCFO.centralizedFringeBatchStep.13757803.13757806.01-20140718-09-06-05-163.log

               *******************************************************

 

2014-07-18 09:07:45,475 [main] ERROR org.kuali.ext.mm.sys.batch.client.rmi.BatchJobRmiInvokerClient :: javax.xml.ws.soap.SOAPFaultException: OJB operation failed; nested exception is org.apache.ojb.broker.OptimisticLockException: Object has been modified by someone else: org.kuali.rice.coreservice.impl.parameter.ParameterBo@3dace634[namespaceCode=KFS-BC,componentCode=BudgetConstruction,name=RUN_CENTRALIZED_FRINGE_BATCH_JOB,applicationId=KUALI,value=N,description=Determines whether or not centralized fringe benefits should be calculated.,parameterTypeCode=VALID,evaluationOperatorCode=A,versionNumber=31,objectId=024C69BD-C258-186C-169A-05EED2012312,newCollectionRecord=false]

<#/ais02/job/prod/kfsx_java_ssh.ksh.127#> errtrap_ssh /ais02/job/prod/kfsx_java_ssh.ksh 1

Dermot.

 

Doesn’t want to allow a change to RUN_CENTRALIZED_FRINGE_BATCH_JOB parm. 

Found this log entry:

 

--------------------------------------

2014-07-18 10:18:38,075 [RMI TCP Connection(2)-129.82.111.80] INFO  org.kuali.kfs.module.bc.batch.service.impl.CentralizedFringeCalculationServiceImpl :: !!! calculating centralized fringe for fringe pool account 1301960

2014-07-18 10:18:38,433 [RMI TCP Connection(2)-129.82.111.80] INFO  org.apache.cxf.services.parameterRepositoryService.parameterServicePort.parameterService :: Outbound Message

Still looking beyond this point, but wonder if you can check the account 1301960?

 

Mike.

 

 

 

Aborted Module Name:  TSMCSTAT.SEND_MAIL_01

  Date:        Day:      Time:          Resolution:

07/26/14     Sat          13:30           Restarted by Rich.

 

 

Error log and follow up comments:

 

 

TRACKER NEWS REPORT: 07/28/2014-06:30

======================================================================

07/26/2014 13:52    RBLUMLEIN

I saw there was an abort for an appman job TSMCSTAT.SEND_MAIL_01. It aborted for a missing file /ais01/dat/work/prod/TSMCSTAT.TSMC_JH_STATUS_01.DAT

I didn't see anywhere in the job where it was created or removed and it wasn't in Tivoli. So I bet it was created from another job and used as a work file.

I emailed Joleen who looked into this and said there was a previous job that looks like it didn't copy the file from /appworx/out to /ais01/dat/work/prod properly.

I looked in /appworx/out and figured out the file that wasn't copied and then copied and renamed it manually.

 

Then I restarted the job which sent out the TSMC System Backups - Status email properly.

 

 

 

 

Aborted Module Name:   AROSDGLI.AROSS167_01

  Date:        Day:      Time:          Resolution:

07/29/14     Tue         01:38          Restarted by Joleen.

 

Error log and follow up comments:

 

ERROR at line 1:

ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

ORA-06512: at line 373

 

01:38:12 372         loop

01:38:12 373            fetch trx_cur into trx_rec;

01:38:12 374            exit when trx_cur%NOTFOUND;

 

It looks like something may have used all the TEMP space last night and caused this to fail.  The AROSS167 job only generates a report and can be restarted any time.

 

~Steven Dove
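
When comparing a restart against the original abort, it can help to pull just the ORA codes out of the job output. The following is a minimal sketch, assuming the output is plain text on standard input; the function name ora_codes is made up for illustration and is not part of the AROSDGLI chain.

```shell
#!/bin/sh
# Sketch only: list the distinct ORA-nnnnn codes found in job output.
# Reads text on stdin and prints one code per line, sorted.
# Note: if a line contains several codes, only the last one on that
# line is reported.
ora_codes() {
    sed -n 's/.*\(ORA-[0-9][0-9]*\).*/\1/p' | sort -u
}

# Example: ora_codes < job_output.txt
# (job_output.txt is a hypothetical file name)
```

If the restart shows the same codes as the first abort (here ORA-01652 on TEMP), the cause is probably still present and another restart is unlikely to help until the DBA has looked at it.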

 

We are having a problem with an AR job aborting.

Here is the ORA error we are getting:

ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

I tried restarting once and the job aborted again with the tablespace TEMP error.

Joleen.

 

Try it again please.  I put a temporary program change into temp.

Rob.

 

AROSDGLI.AROSS169_01 has now been running for over an hour and a half. It usually finishes in a minute or two.

Does this process look okay?

David.

 

I don’t see any blocking sessions that would hold this up.  There are 2 high cpu processes running, both from Appworx.

One must be your sqlplus  job, started at 8:54, and the other is called RPEDISB, started at 10:34.

Your job is almost all cpu right now, not a lot of read/writes going on.

So the database looks OK, but the program is doing some processing of the data, hence the high CPU.

Craig.

 

BFS did the first assessment for the fall semester last night.  Unfortunately, it’s going to take a while.

Rob.

 

 

 

Aborted Module Name:   AREGHINS_FA.SSH_SFTP_01

  Date:        Day:      Time:          Resolution:

08/01/14     Fri          18:02           Restarted by Joleen.

 

Error log and follow up comments:

 

 

I tried restarting after receiving the error below. The job aborted again with the same error.

Joleen.

# > ssh: connect to host sftp.renstudent.com port 22: Connection timed out # > Connection closed # > (255)

 

-----Original Message-----

From: jobprd@mailer.is.colostate.edu [mailto:jobprd@kebler.is.colostate.edu]

Sent: Friday, August 01, 2014 6:07 PM

To: IS DL: Alert APMX

Cc: 9705815577@tmomail.net

Subject: AWPROD APMXCHKS Abort Job Backlog Warning

 

Fri Aug 01 18:05:50 MDT 2014                                                                                                       Page 1

                                             Check Backlog for ABORTED jobs (so_status  202)                                            

Job                     Chain Id Start Date              Status Status Name Percentage Diff Observed RunTIme (Min) Average Run Time (Min)

----------------------- -------- ----------------------- ------ ----------- --------------- ---------------------- ----------------------

AREGHINS_FA.SSH_SFTP_01 13876206 08-01-2014 18:02:34 MDT    202 ABORTED             1520.59                    188                     12

 

Same error this morning again. External party needs to be contacted. Either Aries team leads and/or sysadmin should be able to help.

Gudrun.

 

Hi Lynne,

I hope you are the right person to contact. The Health insurance job (AREGHINS_HEALTH_INSURANCE) has aborted. We are unable to connect to the Ascension site. Could you contact them and find out if something has changed? Here is the error we are receiving.

 

# > ssh: connect to host sftp.renstudent.com port 22: Connection refused # > Connection closed # > (255)

Joleen.
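
Note that the two messages in this entry differ: the first attempts timed out, while the later one was refused. A rough sketch for telling these apart when scanning job output is below. The function name classify_ssh_error and the keywords it prints are assumptions for illustration, not part of the AREGHINS job.

```shell
#!/bin/sh
# Sketch only: classify an ssh connection error message.
# Usage: classify_ssh_error "error text"
#   timeout - no answer at all; typically a network path or firewall
#             problem, or the remote host is down
#   refused - the connection attempt was actively rejected; typically
#             the host is reachable but nothing is accepting on that port
#   other   - anything else
classify_ssh_error() {
    case "$1" in
        *"Connection timed out"*) echo "timeout" ;;
        *"Connection refused"*)   echo "refused" ;;
        *)                        echo "other" ;;
    esac
}

# Example:
# classify_ssh_error "ssh: connect to host sftp.renstudent.com port 22: Connection refused"
```

Either result points to the remote side or the network between us and them, so a restart alone will not fix it and the external party needs to be contacted.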

 

 

 

Aborted Module Name: AREGSPWD.FTPS_NEW_PASS_01 

  Date:        Day:      Time:          Resolution:

07/31/14     Thu        10:05           Restarted by Joleen.

 

Error log and follow up comments:

 

Subject: AREGSPWD - Update DL Password - Verification Required

 

=================================================================

**** AREGSPWD_DL_CHG_PASSWORD - AREGSPWD IS SCHEDULED FOR TODAY

****

**** AREGSPWD_DL_CHG_PASSWORD will **NOT** be released until

**** confirmation email from Registrar

**** has been received by 'IS Support - Scheduling'

****

=================================================================

Registrar:

_  _ Verify that the NEW-$PUCSU@csdcp.state.co.us AppMan Login

     has been updated to the new password.

 

Password has been changed

Jerry.

 

The job aborted. I have attached the output.

 

# CMDOUT #  [530 PASS command failed]

# CMDOUT #  [530 You must first login with USER and PASS.

Joleen.

 

I called the help desk and had them manually set it to match what I entered in AppMan.

Jerry.

 

I should have mentioned when I responded that I deleted AREGSPWD.FTPS_NEW_PASS_01 and let AREGSPWD.DL_PASS_01 run to update the AppMan login. Could you put that in your abort book so everyone will know that is what needs to happen when Jerry calls the DOR and has them manually update the password? I would appreciate that.

Joleen.

 

 

 

 

Aborted Module Name:  

  Date:        Day:      Time:          Resolution:

 

Error log and follow up comments:

 

 

 

 

 
