Comments (2)
Hello, what are the labels for each of the 120 categories in the dataset?
The 120 categories are listed below:
|-- 163.com
|-- 51.la
|-- 51cto.com
|-- acm.org
|-- adobe.com
|-- alibaba.com
|-- alicdn.com
|-- alipay.com
|-- amap.com
|-- amazonaws.com
|-- ampproject.org
|-- apple.com
|-- arxiv.org
|-- asus.com
|-- atlassian.net
|-- azureedge.net
|-- baidu.com
|-- bilibili.com
|-- biligame.com
|-- booking.com
|-- chia.net
|-- chinatax.gov.cn
|-- cisco.com
|-- cloudflare.com
|-- cloudfront.net
|-- cnblogs.com
|-- codepen.io
|-- crazyegg.com
|-- criteo.com
|-- ctrip.com
|-- dailymotion.com
|-- deepl.com
|-- digitaloceanspaces.com
|-- duckduckgo.com
|-- eastday.com
|-- eastmoney.com
|-- elsevier.com
|-- facebook.com
|-- feishu.cn
|-- ggpht.com
|-- github.com
|-- gitlab.com
|-- gmail.com
|-- goat.com
|-- google.com
|-- grammarly.com
|-- gravatar.com
|-- guancha.cn
|-- huanqiu.com
|-- huawei.com
|-- hubspot.com
|-- huya.com
|-- ibm.com
|-- icloud.com
|-- ieee.org
|-- instagram.com
|-- iqiyi.com
|-- jb51.net
|-- jd.com
|-- kugou.com
|-- leetcode-cn.com
|-- media.net
|-- mi.com
|-- microsoft.com
|-- mozilla.org
|-- msn.com
|-- naver.com
|-- netflix.com
|-- nike.com
|-- notion.so
|-- nvidia.com
|-- office.net
|-- onlinedown.net
|-- opera.com
|-- oracle.com
|-- outbrain.com
|-- overleaf.com
|-- paypal.com
|-- pinduoduo.com
|-- python.org
|-- qcloud.com
|-- qq.com
|-- researchgate.net
|-- runoob.com
|-- sciencedirect.com
|-- semanticscholar.org
|-- sina.com.cn
|-- smzdm.com
|-- snapchat.com
|-- sohu.com
|-- spring.io
|-- springer.com
|-- squarespace.com
|-- statcounter.com
|-- steampowered.com
|-- t.co
|-- taboola.com
|-- teads.tv
|-- thepaper.cn
|-- tiktok.com
|-- toutiao.com
|-- twimg.com
|-- twitter.com
|-- unity3d.com
|-- v2ex.com
|-- vivo.com.cn
|-- vk.com
|-- vmware.com
|-- walmart.com
|-- weibo.com
|-- wikimedia.org
|-- wikipedia.org
|-- wp.com
|-- xiaomi.com
|-- ximalaya.com
|-- yahoo.com
|-- yandex.ru
|-- youtube.com
|-- yy.com
`-- zhihu.com
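
For anyone who wants to turn the listing above into class labels, here is a minimal sketch. It assumes the listing is saved as plain text in the `tree`-style format shown, and it assigns integer IDs in sorted order. That ordering is only an assumption for illustration; the thread does not say which IDs the dataset actually uses. The names `parse_categories` and `build_label_map` are hypothetical helpers, not part of ET-BERT.

```python
import re

# Matches one entry of a tree-style listing, e.g. "|-- 163.com" or "`-- zhihu.com".
ENTRY = re.compile(r"^\s*[|`]--\s+(\S+)\s*$")


def parse_categories(listing):
    """Return the category names found in a tree-style listing, in file order."""
    names = []
    for line in listing.splitlines():
        match = ENTRY.match(line)
        if match:
            names.append(match.group(1))
    return names


def build_label_map(names):
    """Map each name to a consecutive integer ID.

    Sorted order is an assumed convention, not confirmed by the dataset authors.
    """
    return {name: idx for idx, name in enumerate(sorted(names))}


# A three-line excerpt of the listing, used here only as sample input.
sample = """|-- 163.com
|-- 51.la
`-- zhihu.com"""

names = parse_categories(sample)
label_map = build_label_map(names)
print(len(names), label_map)
```

Run on the full listing, `len(names)` should come out as 120 if every line was parsed.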
from et-bert.
Thank you!