I Can Do I.T.: June 2017

Tuesday 27 June 2017

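[Encoding] Summary

KSC5601 (Charset) : the Korean standard for precomposed (Wansung) Hangul. It represents 2,350 Hangul syllables.

KSC5636 (Charset) : the standard for Latin letters. It is ASCII with the backslash replaced by the won sign.

EUC-KR (Encoding) : the Korean encoding among the Extended Unix Code (EUC) schemes that Bell Laboratories proposed to support non-English characters on Unix.

It handles English with KSC5636 and Hangul with KSC5601.

The following is reference material.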

http://nuli.navercorp.com/sharing/blog/post/1079940

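Reasons the precomposed code (KSC5601-87) was adopted as the standard

It follows ISO-2022 (the international standard on the extension methods to observe when using character codes of two or more bytes), so it is advantageous for working with foreign networks and software.
A survey of actual Hangul usage at the time showed that 2,350 Hangul syllables are enough to express everything.
Sorting can be done through a Hangul conversion table, so it is not a major problem.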

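Objections to the precomposed code (KSC5601-87)

It is a mere code that ignores the principle behind Hangul's creation: the distinction between initial, medial, and final sounds.
It cannot represent every Hangul syllable, which is a problem when writing literary works or expressing newly coined words.
It ends up restricting the scope of the Korean language.
Although it is called a Hangul code, it contains more Hanja than Hangul. It also has too many unnecessary special characters and even includes foreign character sets, which shrinks the space available for Hangul.
Because phonemes are hard to analyze, morphological analysis is impossible, so the code cannot be used later for speech recognition.
It claims to follow ISO-2022, but it has not received ISO certification.
When implementing a Hangul automaton, key input produces a Johab (combining) code that must then be converted to a precomposed code through a table before use, which adds overhead.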

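Once KSC5601 was established as the standard, companies rushed to ship only precomposed Hangul. As a result, word processors could not display some Hangul correctly (archaic words or phonetic readings), and awkward workarounds multiplied, such as word processors composing codes internally to make such text displayable.

Later, KSC5657 was published, adding 1,930 Hangul syllables to KSC5601-87, but it still did not solve the fundamental problem and was hardly used. Finally, in 1992 the government established KSC5601-92, which accommodates Johab Hangul alongside the existing KSC5601 precomposed code, and it remains in use today.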

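The Unicode project

Unicode is an industry standard for representing all of the world's characters uniformly, and it is maintained by the Unicode Consortium. It includes the character set of ISO-10646, character encodings, and the decoding algorithms needed to display characters.

ISO-10646 is an international standard for character representation. Early on, ISO-10646 and Unicode were separate, independent standards, but when ISO-10646-1 was established, the responsible ISO committee and the Unicode Consortium agreed to unify how characters are represented, so Unicode can be regarded as today's international standard.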

(The ISO-10646 repertoire includes the Hangul characters of KS X 1001 (KSC5601), the set that EUC-KR and ISO-2022-KR encode.)

UTF (Unicode Transformation Format)

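A UTF is a scheme for transforming Unicode characters into bytes. Unicode code points range from U+0000 to U+10FFFF, so depending on the range a character falls in, it can be encoded in 1 to 4 bytes (as in UTF-8). Variants include UTF-7, UTF-8, UTF-16BE, and UTF-16LE.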
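To make the "1 to 4 bytes" point concrete, here is a minimal Java sketch that measures how many bytes UTF-8 and UTF-16BE need for a few sample characters. The class and method names are made up for this illustration, and the samples are written as escapes so the source file encoding does not matter.

```java
import java.nio.charset.StandardCharsets;

public class UtfLengthSketch {

    // Number of bytes UTF-8 needs for the given text
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes UTF-16BE needs for the given text (no byte order mark)
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // 'A' (U+0041), 'é' (U+00E9), '한' (U+D55C), and U+1F600 as a surrogate pair
        String[] samples = { "A", "\u00E9", "\uD55C", "\uD83D\uDE00" };
        for (String s : samples) {
            System.out.printf("U+%04X UTF-8=%d bytes, UTF-16BE=%d bytes%n",
                    s.codePointAt(0), utf8Length(s), utf16Length(s));
        }
    }
}
```

A Hangul syllable such as 한 takes 3 bytes in UTF-8 but 2 bytes in UTF-16, which is one reason the same text has different sizes in different UTFs.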

EUC

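EUC stands for Extended Unix Code, an extended code for representing non-English characters on Unix. Among its variants, euc-kr is the character encoding for Korean: it handles English with KSC5636 (identical to ASCII except that the backslash is replaced by the won sign) and Hangul with KSC5601. euc-kr used to mean the precomposed Hangul of KSC5601-87, but today's euc-kr is based on KSC5601-92 and can also use Johab Hangul.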

CP949

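This is the Hangul character code table used by Microsoft. Code pages were originally devised by IBM, but because MS Windows adopted this one for Korean, it is also called MS949. At first CP949 provided only the 2,350 syllables defined in KSC5601, but after KSC5601-92 was established, codes for the Johab Hangul syllables were added as well.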

KSC5601 vs EUC-KR vs CP949

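KSC5601 is the Hangul character code standard that can represent all Hangul, both precomposed and Johab, and euc-kr and CP949 are both based on it. euc-kr, the Unix-family Hangul code, adopts KSC5601 as is. CP949 (MS949), the Windows-family Hangul code, takes the precomposed form but also provides codes for the Hangul that KSC5601 builds in Johab form. So although the two encode differently, they produce the same codes and are compatible with each other. Note, however, that in a Java environment euc-kr is treated as KSC5601-87, so it may not be compatible with CP949's extended precomposed set.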

KSC5601 vs Unicode

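Unicode contains the KSC5601 character set, but the byte values depend on which range of the Unicode code space a character falls in and on which transformation format is used. euc-kr (CP949), which uses KSC5601 codes directly, is therefore not byte-compatible with Unicode encodings, and text has to be converted between them.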
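As a rough illustration of that incompatibility, the following Java sketch prints the bytes of the same two syllables ("한글") in UTF-8 and in EUC-KR. The class and method names are invented for this example, and the EUC-KR part is guarded with Charset.isSupported because that charset ships in the JDK's extended charset set and may be missing on a stripped-down runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HangulBytesSketch {

    // Formats bytes as space-separated upper-case hex, e.g. "C7 D1"
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    // Encodes the text with the given charset and returns the hex dump
    static String encodeHex(String text, Charset charset) {
        return hex(text.getBytes(charset));
    }

    public static void main(String[] args) {
        String text = "\uD55C\uAE00"; // the two syllables of "Hangul"
        System.out.println("UTF-8  : " + encodeHex(text, StandardCharsets.UTF_8));

        if (Charset.isSupported("EUC-KR")) {
            Charset eucKr = Charset.forName("EUC-KR");
            System.out.println("EUC-KR : " + encodeHex(text, eucKr));

            // Decoding EUC-KR bytes as if they were UTF-8 is a typical cause of garbled Hangul
            String misread = new String(text.getBytes(eucKr), StandardCharsets.UTF_8);
            System.out.println("Survives a wrong decode? " + text.equals(misread));
        }
    }
}
```

The two hex dumps share no bytes, so a reader has to know which encoding was used before decoding.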

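Conclusion

Today's Hangul standard codes make no distinction between precomposed and Johab. Whether on a web page declared as euc-kr or on Windows using MS949, characters such as 똠, 꿿, 휗, and 휅 are all displayed, so Hangul is easier to use. As long as you keep the international standard Unicode and the Unix-family euc-kr clearly apart, Hangul should not get garbled because of encoding.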

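[Oracle] Oracle JDBC Charset Conversion Process

If the database charset is US7ASCII or WE8ISO8859P1, the JDBC driver always converts the data directly to UTF-16 (the encoding of Java Strings) and passes it to the client.

If the database charset is neither US7ASCII nor WE8ISO8859P1, the JDBC driver converts the data to UTF8 first, then converts it again to UTF-16 and passes it to the client. (For the server-side driver, the UTF8 conversion step is skipped.)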
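The rule above can be written down as a small decision function. This is only a restatement of the two cases in Java, not driver code: the class name, method name, and returned strings are made up for illustration. The database charset itself can be read with a query such as SELECT VALUE FROM NLS_DATABASE_PARAMETERS WHERE PARAMETER = 'NLS_CHARACTERSET'.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class JdbcCharsetPathSketch {

    // Charsets that, per the note above, are converted straight to UTF-16
    private static final Set<String> DIRECT =
            new HashSet<>(Arrays.asList("US7ASCII", "WE8ISO8859P1"));

    // Describes the conversion path for a given database charset (illustrative only)
    static String conversionPath(String dbCharset, boolean serverSide) {
        boolean direct = DIRECT.contains(dbCharset.toUpperCase(Locale.ROOT));
        if (direct || serverSide) {
            return "DB charset -> UTF-16";
        }
        return "DB charset -> UTF8 -> UTF-16";
    }

    public static void main(String[] args) {
        System.out.println("WE8ISO8859P1: " + conversionPath("WE8ISO8859P1", false));
        System.out.println("KO16MSWIN949: " + conversionPath("KO16MSWIN949", false));
    }
}
```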

* reference : https://docs.oracle.com/cd/B10501_01/java.920/a96654/advanc.htm

Thursday 22 June 2017

[Oracle] LONG data type

The LONG type in Oracle is not a bigint (64-bit integer); it holds character data.

LOB Character Datatypes

The LOB datatypes for character data are CLOB and NCLOB. They can store up to 8 terabytes of character data (CLOB) or national character set data (NCLOB).

LONG Datatype

Note:

Do not create tables with LONG columns. Use LOB columns (CLOB, NCLOB) instead. LONG columns are supported only for backward compatibility.

Oracle also recommends that you convert existing LONG columns to LOB columns. LOB columns are subject to far fewer restrictions than LONG columns. Further, LOB functionality is enhanced in every release, whereas LONG functionality has been static for several releases.

Columns defined as LONG can store variable-length character data containing up to 2 gigabytes of information. LONG data is text data that is to be appropriately converted when moving among different systems.

LONG datatype columns are used in the data dictionary to store the text of view definitions. You can use LONG columns in SELECT lists, SET clauses of UPDATE statements, and VALUES clauses of INSERT statements.

reference : https://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#CNCPT1830

Wednesday 21 June 2017

[Java] How to find running Java processes in Java

/**
 *********************************************************
 * FindRunningJavaProcess.java
 *********************************************************
 * @version 1.0.00 2017. 6. 21. dorbae(dorbae.io@gmail.com) Initialize
 * @since 2017. 6. 21.
 * @author dorbae(dorbae.io@gmail.com) Initialize
 */
package pe.dorbae.test;

import java.util.Set;

import sun.jvmstat.monitor.HostIdentifier;
import sun.jvmstat.monitor.MonitoredHost;
import sun.jvmstat.monitor.MonitoredVm;
import sun.jvmstat.monitor.MonitoredVmUtil;
import sun.jvmstat.monitor.VmIdentifier;

/**
 * Lists the Java processes running on the local host through the jvmstat
 * monitoring API. In JDK 8 and earlier the sun.jvmstat classes live in
 * tools.jar, which must be on the classpath.
 */
public class FindRunningJavaProcess {

    /**
     *********************************************************
     * @param args
     *********************************************************
     * @version 1.0.00 2017. 6. 21. dorbae(dorbae.io@gmail.com) Initialize
     * @since 2017. 6. 21.
     * @author dorbae(dorbae.io@gmail.com) Initialize
     */
    public static void main(String[] args) {
        try {
            MonitoredHost monitoredHost = MonitoredHost.getMonitoredHost(new HostIdentifier("localhost"));

            // Get the local VM ids of all active JVMs
            Set<Integer> jvms = monitoredHost.activeVms();
            if (jvms == null || jvms.isEmpty()) {
                System.out.println("No active VM found.");
                return;
            }

            for (Integer lvmId : jvms) {
                // Declared per iteration so a VM from an earlier pass is never detached twice
                MonitoredVm vm = null;
                try {
                    vm = monitoredHost.getMonitoredVm(new VmIdentifier("//" + lvmId), 0);
                    if (vm == null) {
                        System.out.println("MonitoredVm is null.");
                        continue;
                    }

                    // Main class name. true : full name including package, false : class name only
                    String className = MonitoredVmUtil.mainClass(vm, true);
                    // Program arguments
                    String arg = MonitoredVmUtil.mainArgs(vm);
                    // VM options
                    String vmArg = MonitoredVmUtil.jvmArgs(vm);
                    // JVM version
                    String vmVersion = MonitoredVmUtil.vmVersion(vm);
                    // Command line
                    String commandLine = MonitoredVmUtil.commandLine(vm);

                    System.out.printf("CLASS=[%s], ARG=[%s], VMARG=[%s], VMVERSION=[%s], COMMANDLINE=[%s]%n",
                            className, arg, vmArg, vmVersion, commandLine);

                } catch (Exception e) {
                    System.out.println("Failed to get MonitoredVm for lvmid " + lvmId + ".");
                    e.printStackTrace();

                } finally {
                    if (vm != null) {
                        try {
                            monitoredHost.detach(vm);
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                }
            } // End of for

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}
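For comparison, here is a sketch of a different technique that does not depend on the internal sun.jvmstat classes: the public ProcessHandle API, available from Java 9. It is not the approach used in the post, and it exposes less detail (no main class or VM version). It simply filters OS processes whose executable looks like a Java launcher. The class name and the looksLikeJava helper are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class ListJavaProcessesSketch {

    // True when the executable file name is a Java launcher (java, java.exe, javaw.exe)
    static boolean looksLikeJava(String command) {
        String path = command.replace('\\', '/');
        String name = path.substring(path.lastIndexOf('/') + 1).toLowerCase(Locale.ROOT);
        return name.equals("java") || name.equals("java.exe") || name.equals("javaw.exe");
    }

    // Returns "pid command-line" for every visible process that looks like a JVM
    static List<String> javaProcesses() {
        return ProcessHandle.allProcesses()
                .filter(p -> p.info().command().map(ListJavaProcessesSketch::looksLikeJava).orElse(false))
                .map(p -> p.pid() + " " + p.info().commandLine().orElse(p.info().command().orElse("?")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The OS may hide command information for processes owned by other users
        javaProcesses().forEach(System.out::println);
    }
}
```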

Monday 19 June 2017

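[Oracle] Query for looking up indexes

SELECT A.UNIQUENESS
     , A.INDEX_TYPE
     , A.TABLE_OWNER
     , A.TABLE_NAME
     , B.*
  FROM ALL_INDEXES A,
       ALL_IND_COLUMNS B
 WHERE A.OWNER = B.INDEX_OWNER  -- join on owner too, so same-named indexes in other schemas do not mix
   AND A.INDEX_NAME = B.INDEX_NAME
   AND A.TABLE_NAME = UPPER('TAGPA201_ENC')
 ORDER BY B.INDEX_NAME, B.COLUMN_POSITION;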

Sunday 18 June 2017

[Oracle] Oracle JDBC DatabaseMetaData.getIndexInfo() triggers analyze table compute statistics

When I called DatabaseMetaData.getIndexInfo() to check an index, the Oracle database became very slow, so I looked into why.

ResultSet getIndexInfo(String catalog, String schema, String table, boolean unique, boolean approximate) throws SQLException

Retrieves a description of the given table’s indices and statistics. They are ordered by NON_UNIQUE, TYPE, INDEX_NAME, and ORDINAL_POSITION.

Each index column description has the following columns: (snipped)

Parameters:
catalog – a catalog name; must match the catalog name as it is stored in this database; “” retrieves those without a catalog; null means that the catalog name should not be used to narrow the search
schema – a schema name; must match the schema name as it is stored in this database; “” retrieves those without a schema; null means that the schema name should not be used to narrow the search
table – a table name; must match the table name as it is stored in this database
unique – when true, return only indices for unique values; when false, return indices regardless of whether unique or not
approximate – when true, result is allowed to reflect approximate or out of data values; when false, results are requested to be accurate

When this method is called, the Oracle JDBC driver executes an analyze table statement to gather statistics.
If approximate is true, it executes analyze table schema.table estimate statistics.
If approximate is false, it executes analyze table schema.table compute statistics.

Estimate incurs less overhead than compute, but either way analyze table slows the database down.
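If the call cannot be avoided, passing approximate = true at least requests the cheaper estimate variant. Below is a minimal sketch of such a call using only the standard java.sql API; the class name, the indexNames helper, and the command-line arguments are assumptions for this example, and whether the driver still analyzes the table depends on the driver version.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashSet;
import java.util.Set;

public class IndexInfoSketch {

    // Collects the distinct index names of a table, accepting approximate statistics
    static Set<String> indexNames(DatabaseMetaData meta, String schema, String table) throws SQLException {
        Set<String> names = new LinkedHashSet<>();
        // unique = false: all indexes; approximate = true: estimated statistics are acceptable
        try (ResultSet rs = meta.getIndexInfo(null, schema, table, false, true)) {
            while (rs.next()) {
                String name = rs.getString("INDEX_NAME");
                if (name != null) { // rows of type tableIndexStatistic carry no index name
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 5) {
            System.out.println("usage: <jdbcUrl> <user> <password> <schema> <table>");
            return;
        }
        try (Connection con = DriverManager.getConnection(args[0], args[1], args[2])) {
            System.out.println(indexNames(con.getMetaData(), args[3], args[4]));
        }
    }
}
```

When only the index names and columns are needed, querying ALL_INDEXES and ALL_IND_COLUMNS directly avoids the statistics step altogether.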

reference
https://timurakhmadeev.wordpress.com/2010/01/11/databasemetadatagetindexinfo/

Tuesday 13 June 2017

[RCP] java.lang.IllegalStateException: Cannot change the location once it is set.

When I set the workspace location in Activator.start( BundleContext), this exception occurs.

The solution is to add the -data @noDefault runtime option.

Monday 12 June 2017

[Oracle] How to set different SID in Oracle JDBC Failover URL

jdbc:oracle:thin:@(DESCRIPTION_LIST=
(FAILOVER=ON)(LOAD_BALANCE=OFF)
(DESCRIPTION=
(ADDRESS=(PROTOCOL=TCP)(HOST=[DB1_HOST])(PORT=[DB1_PORT]))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=[DB1_SID])))
(DESCRIPTION=
(ADDRESS=(PROTOCOL=TCP)(HOST=[DB2_HOST])(PORT=[DB2_PORT]))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=[DB2_SID])))
)
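Since it is easy to lose a bracket or parenthesis when writing this URL by hand, one option is to assemble it in code. The sketch below builds the same structure as the template above; the class name, method names, host names, and service names are placeholders invented for this example.

```java
public class FailoverUrlSketch {

    // One DESCRIPTION entry for a single database, mirroring the template above
    static String description(String host, int port, String serviceName) {
        return "(DESCRIPTION="
                + "(ADDRESS=(PROTOCOL=TCP)(HOST=" + host + ")(PORT=" + port + "))"
                + "(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=" + serviceName + ")))";
    }

    // Wraps the entries in a DESCRIPTION_LIST with failover on and load balancing off
    static String failoverUrl(String... descriptions) {
        StringBuilder url = new StringBuilder(
                "jdbc:oracle:thin:@(DESCRIPTION_LIST=(FAILOVER=ON)(LOAD_BALANCE=OFF)");
        for (String d : descriptions) {
            url.append(d);
        }
        return url.append(')').toString();
    }

    public static void main(String[] args) {
        // Hypothetical hosts and service names; each database may use a different one
        System.out.println(failoverUrl(
                description("db1.example.com", 1521, "ORCL1"),
                description("db2.example.com", 1522, "ORCL2")));
    }
}
```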
