Differences

Shows the differences between the two selected revisions of the document.

tech:wget [2018/04/15 00:06] V_L
tech:wget [2023/07/23 00:04] (current) – [Download an entire site] V_L
Line 22: Line 22:
 =====Options=====
  
-  wget -i mylist.txt 
  
-Fetches the files listed in mylist.txt
  
-Basic arguments 
-These are the basic arguments needed to perform the recursive download. 
  
---recursive
 +<file>
 +  --recursive
 +</file>
 Tells wget to download pages recursively, starting from the specified URL.
---level=1
 +<file>
 +  --level=1
 +</file>
 Tells wget to stop after one level of recursion. Raise the value to download more deeply, or set it to 0 for no limit.
---no-clobber
 +<file>
 +  --no-clobber
 +</file>
 Skips downloads that would overwrite existing files.
---page-requisites
 +<file>
 +  --page-requisites
 +</file>
 Tells wget to download all the resources (images, CSS, JavaScript, ...) that the page needs in order to work.
---html-extension
 +<file>
 +  --html-extension
 +</file>
 Adds a ".html" extension to downloaded files. This makes the browser recognize them as HTML files and also resolves naming conflicts for "generated" URLs, where there are no directories with "index.html" but only a framework that responds dynamically with generated pages. (wget 1.12 and later name this option --adjust-extension.)
---convert-links
 +<file>
 +  --convert-links
 +</file>
 After the download is complete, converts the links in the document to make them suitable for local viewing. This affects not only the visible hyperlinks, but any part of the document that links to external content, such as embedded images, links to style sheets, hyperlinks to non-HTML content, etc.
---no-parent
 +<file>
 +  --no-parent
 +</file>
 Never ascends to the parent directory when retrieving recursively.
---domains=www.example.com
 +<file>
 +  --domains=www.example.com
 +</file>
 Sets the domains to be followed, given as a comma-separated list.
 +
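The basic arguments above can be combined into a single call. The following is only a minimal sketch: `fetch_one_level` is a hypothetical helper name, `WGET` is an override variable invented here so the arguments can be previewed without downloading, and the URL in the example call is a placeholder.

```shell
# Hypothetical helper that bundles the basic arguments described above.
# Assumption: WGET is an override invented for this sketch; point it at
# another command to preview the arguments instead of downloading.
fetch_one_level() {
  "${WGET:-wget}" \
    --recursive \
    --level=1 \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --no-parent \
    --domains=www.example.com \
    "$@"
}

# Example call (placeholder URL):
#   fetch_one_level http://www.example.com/docs/
```

Some wget versions print a warning that --no-clobber is ignored when --convert-links is also given, so it may be worth dropping one of the two depending on the goal.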
 Avoiding imposed download limits
 Many web servers limit how many pages a user can download in a given amount of time, or which user agents may access given pages. A few extra options help avoid such limits.
 
--U "Mozilla/5.0 (X11; U; Linux; en-US; rv:1.9.1.16) Gecko/20110929 Firefox/3.5.16"
 +<file>
 +  -U "Mozilla/5.0 (X11; U; Linux; en-US; rv:1.9.1.16) Gecko/20110929 Firefox/3.5.16"
 +</file>
 Tells wget to send a fake user-agent string that imitates a web browser (in this case, Firefox 3.5 on Linux).
---wait=3
 +<file>
 +  --wait=3
 +</file>
 Tells wget to wait 3 seconds between retrievals.
---random-wait
 +<file>
 +  --random-wait
 +</file>
 Tells wget to vary the wait between requests randomly. Current wget manuals give the range as 0.5 to 1.5 times the value specified with --wait; older manuals describe it as 0 to 2 times that value.
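The three options above are typically used together. As a rough sketch, they could be wrapped in a small function; `polite_wget` is a hypothetical name, `WGET` is an override variable invented for previewing, and the URL in the example call is a placeholder.

```shell
# Hypothetical wrapper that adds the "polite" options from this section.
# Assumption: WGET is an override invented for this sketch, not a wget feature.
polite_wget() {
  "${WGET:-wget}" \
    --user-agent="Mozilla/5.0 (X11; U; Linux; en-US; rv:1.9.1.16) Gecko/20110929 Firefox/3.5.16" \
    --wait=3 \
    --random-wait \
    "$@"
}

# Example call (placeholder URL); extra options are passed straight through:
#   polite_wget --recursive --level=1 http://www.example.com/
```

The user-agent string is quoted as one argument because it contains spaces; `--user-agent` is the long form of `-U`.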
  
--P prefix
 +<file>
 +  -P prefix
 --directory-prefix=prefix
-           Set directory prefix to prefix.  The directory prefix is the
-           directory where all other files and sub-directories will be
-           saved to, i.e. the top of the retrieval tree.  The default
-           is . (the current directory).
 +</file>
 +
 +Set directory prefix to prefix.  The directory prefix is the
 +directory where all other files and sub-directories will be
 +saved to, i.e. the top of the retrieval tree.  The default
 +is . (the current directory).
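To illustrate the directory prefix, here is a small sketch. `fetch_into` is a hypothetical helper, `WGET` is an override variable invented for previewing, and `./mirror` and the URL are placeholder values.

```shell
# Hypothetical helper: the first argument is the directory prefix, the rest
# are passed to wget unchanged.
# Assumption: WGET is an override invented for this sketch.
fetch_into() {
  dest=$1
  shift
  "${WGET:-wget}" --directory-prefix="$dest" "$@"
}

# Example call (placeholder values):
#   fetch_into ./mirror --recursive --no-parent http://www.example.com/docs/
```

With a recursive download, wget normally recreates the host name as a directory under the prefix, so the example call would be expected to save files under ./mirror/www.example.com/docs/.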
 =====Examples=====
  ====Download an entire site====
Line 81: Line 103:
   --post-data=string
      
 +  
 +<file>
 +   --recursive: download the entire Web site.
 +   --domains website.org: don't follow links outside website.org.
 +   --no-parent: don't follow links outside the directory tutorials/html/.
 +   --page-requisites: get all the elements that compose the page (images, CSS and so on).
 +   --html-extension: save files with the .html extension.
 +   --convert-links: convert links so that they work locally, off-line.
 +   --restrict-file-names=windows: modify filenames so that they will work in Windows as well.
 +   --no-clobber: don't overwrite any existing files (used in case the download is interrupted and resumed).
 +</file> 
 ====Download photos====
  
-  wget   -r   -np   --reject   "*.txt"   http://192.168.0.100/images
 +  wget -r -np --reject "*.txt" http://192.168.0.100/images
-
--r is short for --recursive. By default, wget downloads all files in subfolders up to 5 levels deep.
-
- -l (lowercase L) reportedly lets you go deeper; l stands for level.
-
--np is short for no-parent; it seems to mean that files in the parent directory are not downloaded when running with the recursive option.
-
-Finally, assuming the folder contains both images and text,
-the reject option lets you download only the images.
 
-[[http://ngee.tistory.com/376|source: ngee]]
 +  * -r is short for --recursive.
 +  * -l (lowercase L) sets how many levels of subfolders to descend. By default, wget downloads all files in subfolders up to 5 levels deep.
 +  * -np is short for --no-parent. When running with the recursive option, it tells wget not to download files from the parent directory.
 +  * Assuming the folder contains both images and text, the --reject option excludes the txt files. [[http://ngee.tistory.com/376|source: ngee]]
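As a variation on the command above, an allow-list with --accept can be used instead of a reject list, together with an explicit depth. This is only a sketch: `fetch_images` is a hypothetical helper, `WGET` is an override variable invented for previewing, and the depth of 2 and the list of image suffixes are assumptions.

```shell
# Hypothetical helper: recursive download that keeps only image files.
# Assumptions: depth 2 and the suffix list are illustrative choices;
# WGET is an override invented for this sketch.
fetch_images() {
  "${WGET:-wget}" \
    --recursive \
    --level=2 \
    --no-parent \
    --accept='*.jpg,*.jpeg,*.png,*.gif' \
    "$@"
}

# Example call, using the address from the command above:
#   fetch_images http://192.168.0.100/images/
```

The trailing slash in the example URL matters for --no-parent: wget relies on it to tell that images is a directory rather than a file.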
  
 ====Download multiple files====