Commit ff89faec authored May 21, 2025 by Roger Wu
Merge branch 'release' of ssh://gitlab.gsstcloud.com:10022/poc/poc-api into release
parents 4da4ae97 2eb8f581
Showing 30 changed files with 1075 additions and 212 deletions
...plication/aggregate/impl/AgentApplicationServiceImpl.java  +52 -112
...nt_application/query/AgentApplicationIndexPluginQuery.sql  +9 -0
...tion/query/AgentApplicationIndexPluginQueryCondition.java  +23 -0
...plication/query/AgentApplicationIndexPluginQueryItem.java  +74 -0
...cation/service/BizAgentApplicationIndexPluginService.java  +18 -0
...rvice/impl/BizAgentApplicationIndexPluginServiceImpl.java  +22 -0
...om/poc/agent_application/utils/AgentApplicationTools.java  +3 -1
.../cn/com/poc/common/service/impl/BosConfigServiceImpl.java  +2 -1
...m/poc/expose/aggregate/AgentApplicationExposeService.java  +9 -0
...ose/aggregate/impl/AgentApplicationExposeServiceImpl.java  +13 -0
src/main/java/cn/com/poc/expose/dto/IndexPluginDto.java  +49 -0
...ain/java/cn/com/poc/expose/rest/AgentApplicationRest.java  +8 -0
...cn/com/poc/expose/rest/impl/AgentApplicationRestImpl.java  +17 -0
...va/cn/com/poc/expose/rest/impl/ContentReportRestImpl.java  +47 -8
...y/resource/demand/ai/function/LargeModelFunctionEnum.java  +6 -0
...document_understanding/DocumentUnderstandIngFunction.java  +22 -10
...nd/ai/function/extraction/ContractExtractionFunction.java  +121 -0
...resource/demand/ai/function/extraction/entity/Config.java  +20 -0
...esource/demand/ai/function/extraction/entity/KeyInfo.java  +64 -0
...rce/demand/ai/function/extraction/entity/RequestData.java  +23 -0
...source/demand/ai/function/image_ocr/ImageOCRFunction.java  +6 -0
...tion/long_document_reader/LongDocumentReaderFunction.java  +116 -0
...ce/demand/ai/function/text_in_pdf2md/PdfToMDFunction.java  +12 -3
...urce/demand/ai/function/text_in_pdf2md/api/OCRClient.java  +0 -72
.../com/poc/thirdparty/resource/textin/api/TextInClient.java  +195 -0
...oc/thirdparty/resource/textin/entity/PdfToMDResponse.java  +1 -1
.../poc/thirdparty/resource/textin/entity/PdfToMDResult.java  +8 -0
src/test/java/cn/com/poc/expose/ContentReportTest.java  +83 -0
...ce/demand/ai/function/LongDocumentReaderFunctionTest.java  +35 -0
...arty/resource/demand/ai/function/PdfToMdFunctionTest.java  +17 -4
src/main/java/cn/com/poc/agent_application/aggregate/impl/AgentApplicationServiceImpl.java
This diff is collapsed.
src/main/java/cn/com/poc/agent_application/query/AgentApplicationIndexPluginQuery.sql
new file, mode 0 → 100644
select
    bgaa.agent_id,
    bgaa.agent_avatar,
    bgaa.agent_title,
    bgaa.agent_desc
from biz_agent_application_publish bgaa
join biz_agent_application_index_plugin baaip on bgaa.agent_id = baaip.agent_id and baaip.is_deleted = 'N'
where bgaa.is_deleted = 'N'
<< and LOCATE(:search, bgaa.agent_title) >>
\ No newline at end of file
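The `<< ... >>` markers in the query above look like an optional clause: presumably the query framework drops the enclosed condition when `:search` is not supplied, which would match the REST parameter being declared `required = false`. The following is a minimal sketch of that assumed behavior only; `OptionalClauseSketch` and `render` are hypothetical names and not part of the framework's API.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of an optional "<< ... >>" SQL clause.
// Assumption: the clause is kept only when every named parameter inside it has a non-empty value.
final class OptionalClauseSketch {
    private static final Pattern OPTIONAL = Pattern.compile("<<(.*?)>>", Pattern.DOTALL);
    private static final Pattern PARAM = Pattern.compile(":(\\w+)");

    static String render(String sql, Map<String, String> params) {
        Matcher optional = OPTIONAL.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (optional.find()) {
            String clause = optional.group(1).trim();
            boolean allPresent = true;
            Matcher param = PARAM.matcher(clause);
            while (param.find()) {
                String value = params.get(param.group(1));
                if (value == null || value.isEmpty()) {
                    allPresent = false;
                }
            }
            // Keep the clause text (with its named parameter) or drop it entirely.
            optional.appendReplacement(out, Matcher.quoteReplacement(allPresent ? clause : ""));
        }
        optional.appendTail(out);
        return out.toString().trim();
    }
}
```

Under this assumption, a request without `search` lists every published home-page plugin, while a request with `search` keeps the `LOCATE` filter on `agent_title`.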
src/main/java/cn/com/poc/agent_application/query/AgentApplicationIndexPluginQueryCondition.java
new file, mode 0 → 100644
package cn.com.poc.agent_application.query;

import java.io.Serializable;

/**
 * Query Condition class for AgentApplicationIndexPluginQuery
 */
public class AgentApplicationIndexPluginQueryCondition implements Serializable {
    private static final long serialVersionUID = 1L;

    private java.lang.String search;

    public java.lang.String getSearch(){
        return this.search;
    }

    public void setSearch(java.lang.String search){
        this.search = search;
    }
}
\ No newline at end of file
src/main/java/cn/com/poc/agent_application/query/AgentApplicationIndexPluginQueryItem.java
new file, mode 0 → 100644
package cn.com.poc.agent_application.query;

import java.io.Serializable;
import javax.persistence.Column;
import javax.persistence.Entity;
import cn.com.yict.framemax.data.model.BaseItemClass;

/**
 * Query Item class for AgentApplicationIndexPluginQuery
 */
@Entity
public class AgentApplicationIndexPluginQueryItem extends BaseItemClass implements Serializable {
    private static final long serialVersionUID = 1L;

    /** agent_id */
    private java.lang.String agentId;

    @Column(name = "agent_id")
    public java.lang.String getAgentId(){
        return this.agentId;
    }

    public void setAgentId(java.lang.String agentId){
        this.agentId = agentId;
    }

    /** agent_avatar */
    private java.lang.String agentAvatar;

    @Column(name = "agent_avatar")
    public java.lang.String getAgentAvatar(){
        return this.agentAvatar;
    }

    public void setAgentAvatar(java.lang.String agentAvatar){
        this.agentAvatar = agentAvatar;
    }

    /** agent_title */
    private java.lang.String agentTitle;

    @Column(name = "agent_title")
    public java.lang.String getAgentTitle(){
        return this.agentTitle;
    }

    public void setAgentTitle(java.lang.String agentTitle){
        this.agentTitle = agentTitle;
    }

    /** agent_desc */
    private java.lang.String agentDesc;

    @Column(name = "agent_desc")
    public java.lang.String getAgentDesc(){
        return this.agentDesc;
    }

    public void setAgentDesc(java.lang.String agentDesc){
        this.agentDesc = agentDesc;
    }
}
\ No newline at end of file
src/main/java/cn/com/poc/agent_application/service/BizAgentApplicationIndexPluginService.java
new file, mode 0 → 100644
package cn.com.poc.agent_application.service;

import cn.com.poc.agent_application.query.AgentApplicationIndexPluginQueryCondition;
import cn.com.poc.agent_application.query.AgentApplicationIndexPluginQueryItem;
import cn.com.yict.framemax.core.service.BaseService;

import java.util.List;

/**
 * @author alex.yao
 * @date 2025/5/14
 */
public interface BizAgentApplicationIndexPluginService extends BaseService {

    List<AgentApplicationIndexPluginQueryItem> agentApplicationIndexPluginQuery(AgentApplicationIndexPluginQueryCondition condition);
}
src/main/java/cn/com/poc/agent_application/service/impl/BizAgentApplicationIndexPluginServiceImpl.java
new file, mode 0 → 100644
package cn.com.poc.agent_application.service.impl;

import cn.com.poc.agent_application.query.AgentApplicationIndexPluginQueryCondition;
import cn.com.poc.agent_application.query.AgentApplicationIndexPluginQueryItem;
import cn.com.poc.agent_application.service.BizAgentApplicationIndexPluginService;
import cn.com.yict.framemax.core.service.impl.BaseServiceImpl;
import org.springframework.stereotype.Service;

import java.util.List;

/**
 * @author alex.yao
 * @date 2025/5/14
 */
@Service
public class BizAgentApplicationIndexPluginServiceImpl extends BaseServiceImpl implements BizAgentApplicationIndexPluginService {

    @Override
    public List<AgentApplicationIndexPluginQueryItem> agentApplicationIndexPluginQuery(AgentApplicationIndexPluginQueryCondition condition) {
        return this.sqlDao.query(condition, AgentApplicationIndexPluginQueryItem.class);
    }
}
src/main/java/cn/com/poc/agent_application/utils/AgentApplicationTools.java
@@ -174,7 +174,9 @@ public class AgentApplicationTools {
         }
         query = "用户输入:" + query + "\n";
         if (CollectionUtils.isNotEmpty(fileUrls)) {
             query = query + "用户上传文件地址:" + JsonUtils.serialize(fileUrls) + "\n" + "文件格式:" + fileUrls.get(0).substring(fileUrls.get(0).lastIndexOf(".")) + "\n";
-        }
+        } else {
+            query = query + "用户上传文件地址:无\n" + "文件格式:无\n";
+        }
         List<Tool> deductionTools = new ArrayList<>();
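The hunk above derives the file format with `substring(lastIndexOf("."))` on the first URL. `lastIndexOf` returns -1 when the string has no dot, and `substring(-1)` throws `StringIndexOutOfBoundsException`, so a URL without an extension would fail at that line. A minimal defensive sketch follows; `FileFormatSketch` and `extensionOf` are hypothetical names that do not appear in the commit.

```java
// Hypothetical helper, not part of the commit: extracts ".ext" from a URL
// and returns a caller-supplied fallback when no extension can be determined.
final class FileFormatSketch {
    static String extensionOf(String url, String fallback) {
        if (url == null) {
            return fallback;
        }
        int slash = url.lastIndexOf('/');
        int dot = url.lastIndexOf('.');
        // A dot before the last path separator belongs to the host or a directory,
        // and a trailing dot carries no extension.
        if (dot < 0 || dot < slash || dot == url.length() - 1) {
            return fallback;
        }
        return url.substring(dot);
    }
}
```

The fallback could be the same placeholder that the new `else` branch writes into the prompt when no file was uploaded.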
src/main/java/cn/com/poc/common/service/impl/BosConfigServiceImpl.java
@@ -23,6 +23,7 @@ import java.io.IOException;
 import java.io.InputStream;
 import java.net.HttpURLConnection;
 import java.net.URL;
+import java.nio.charset.StandardCharsets;
 import java.util.Base64;
 import java.util.Date;
 import java.util.Optional;
@@ -91,7 +92,7 @@ public class BosConfigServiceImpl implements BosConfigService {
             meta.setContentDisposition("attachment; filename=" + FILE_NAME);
         }
         // Set the content encoding used when the content is downloaded.
-        meta.setContentEncoding("utf-8");
+        meta.setContentEncoding(StandardCharsets.UTF_8.displayName());
         meta.setContentLength(inputStream.available());
         // Set the upload directory
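The change above swaps the literal `"utf-8"` for `StandardCharsets.UTF_8.displayName()`. In the JDK, `Charset.name()` returns the canonical charset name, while `displayName()` returns a human-readable name that defaults to the canonical one. A small sketch to compare the two; `CharsetNameSketch` is a hypothetical class used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: compares the canonical and display names of the UTF-8 charset.
final class CharsetNameSketch {
    static String canonicalUtf8() {
        return StandardCharsets.UTF_8.name();
    }

    static String displayUtf8() {
        return StandardCharsets.UTF_8.displayName();
    }

    public static void main(String[] args) {
        System.out.println(canonicalUtf8() + " / " + displayUtf8());
    }
}
```

If a stable protocol-level value is what the storage client expects, `name()` may be the safer choice, since `displayName()` is intended for presentation.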
src/main/java/cn/com/poc/expose/aggregate/AgentApplicationExposeService.java
 package cn.com.poc.expose.aggregate;

 import cn.com.poc.agent_application.entity.BizAgentApplicationApiProfileEntity;
+import cn.com.poc.agent_application.query.AgentApplicationIndexPluginQueryItem;
 import cn.com.poc.agent_application.query.MemberCollectQueryItem;
 import cn.com.yict.framemax.data.model.PagingInfo;
@@ -83,4 +84,12 @@ public interface AgentApplicationExposeService {
      */
     BizAgentApplicationApiProfileEntity resetApiProfile(Long memberId) throws Exception;

+    /**
+     * Get the home page plugins
+     *
+     * @param search
+     * @return
+     */
+    List<AgentApplicationIndexPluginQueryItem> getHomePlugins(String search);
 }
src/main/java/cn/com/poc/expose/aggregate/impl/AgentApplicationExposeServiceImpl.java
@@ -4,6 +4,8 @@ import cn.com.poc.agent_application.aggregate.AgentApplicationService;
 import cn.com.poc.agent_application.constant.AgentApplicationDialoguesRecordConstants;
 import cn.com.poc.agent_application.constant.AgentApplicationGCConfigConstants;
 import cn.com.poc.agent_application.entity.*;
+import cn.com.poc.agent_application.query.AgentApplicationIndexPluginQueryCondition;
+import cn.com.poc.agent_application.query.AgentApplicationIndexPluginQueryItem;
 import cn.com.poc.agent_application.query.MemberCollectQueryCondition;
 import cn.com.poc.agent_application.query.MemberCollectQueryItem;
 import cn.com.poc.agent_application.service.*;
@@ -107,6 +109,9 @@ public class AgentApplicationExposeServiceImpl implements AgentApplicationExpose
     @Resource
     private BizAgentApplicationApiProfileService bizAgentApplicationApiProfileService;

+    @Resource
+    private BizAgentApplicationIndexPluginService bizAgentApplicationIndexPluginService;
+
     @Override
     public void callAgentApplication(String agentId, String dialogsId, String input, List<String> fileUrls, String channel, String imageUrl, HttpServletResponse httpServletResponse) throws Exception {
@@ -375,6 +380,14 @@ public class AgentApplicationExposeServiceImpl implements AgentApplicationExpose
         return bizAgentApplicationApiProfileService.resetProfile(memberId);
     }

+    @Override
+    public List<AgentApplicationIndexPluginQueryItem> getHomePlugins(String search) {
+        AgentApplicationIndexPluginQueryCondition condition = new AgentApplicationIndexPluginQueryCondition();
+        condition.setSearch(search);
+        return bizAgentApplicationIndexPluginService.agentApplicationIndexPluginQuery(condition);
+    }
+
     private void createCNQuestion() {
         Message message = new Message();
         message.setRole(LLMRoleEnum.USER.getRole());
src/main/java/cn/com/poc/expose/dto/IndexPluginDto.java
new file, mode 0 → 100644
package cn.com.poc.expose.dto;

/**
 * @author alex.yao
 * @date 2025/5/14
 */
public class IndexPluginDto {

    private String agentId;

    private String agentTitle;

    private String agentDesc;

    private String agentAvatar;

    public String getAgentId() {
        return agentId;
    }

    public void setAgentId(String agentId) {
        this.agentId = agentId;
    }

    public String getAgentTitle() {
        return agentTitle;
    }

    public void setAgentTitle(String agentTitle) {
        this.agentTitle = agentTitle;
    }

    public String getAgentDesc() {
        return agentDesc;
    }

    public void setAgentDesc(String agentDesc) {
        this.agentDesc = agentDesc;
    }

    public String getAgentAvatar() {
        return agentAvatar;
    }

    public void setAgentAvatar(String agentAvatar) {
        this.agentAvatar = agentAvatar;
    }
}
src/main/java/cn/com/poc/expose/rest/AgentApplicationRest.java
@@ -115,4 +115,12 @@ public interface AgentApplicationRest extends BaseRest {
      * @return the application API configuration
      */
     AgentApplicationApiProfileDto resetApiProfile() throws Exception;

+    /**
+     * Get the home page plugins
+     *
+     * @param search search keyword
+     */
+    List<IndexPluginDto> getHomePlugins(@RequestParam(required = false) String search);
 }
src/main/java/cn/com/poc/expose/rest/impl/AgentApplicationRestImpl.java
@@ -288,4 +288,21 @@ public class AgentApplicationRestImpl implements AgentApplicationRest {
         result.setApiSecret(apiProfileEntity.getApiSecret());
         return result;
     }

+    @Override
+    public List<IndexPluginDto> getHomePlugins(String search) {
+        List<AgentApplicationIndexPluginQueryItem> homePlugins = agentApplicationExposeService.getHomePlugins(search);
+        List<IndexPluginDto> result = new ArrayList<>();
+        if (CollectionUtils.isNotEmpty(homePlugins)) {
+            result = homePlugins.stream().map(item -> {
+                IndexPluginDto dto = new IndexPluginDto();
+                dto.setAgentId(item.getAgentId());
+                dto.setAgentTitle(item.getAgentTitle());
+                dto.setAgentDesc(item.getAgentDesc());
+                dto.setAgentAvatar(item.getAgentAvatar());
+                return dto;
+            }).collect(Collectors.toList());
+        }
+        return result;
+    }
 }
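The new `getHomePlugins` endpoint maps each query item to a DTO inside an inline lambda. The same mapping could be factored into a small null-safe helper, roughly as sketched below. `HomePluginMappingSketch`, `Item`, and `Dto` are hypothetical stand-ins for `AgentApplicationIndexPluginQueryItem` and `IndexPluginDto`, reduced to two fields so the block compiles on its own.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in types; the real classes carry four fields with getters and setters.
final class HomePluginMappingSketch {
    static final class Item {
        final String agentId;
        final String agentTitle;

        Item(String agentId, String agentTitle) {
            this.agentId = agentId;
            this.agentTitle = agentTitle;
        }
    }

    static final class Dto {
        String agentId;
        String agentTitle;
    }

    // Maps one query row to its DTO.
    static Dto toDto(Item item) {
        Dto dto = new Dto();
        dto.agentId = item.agentId;
        dto.agentTitle = item.agentTitle;
        return dto;
    }

    // Returns an empty list for null or empty input instead of failing.
    static List<Dto> toDtos(List<Item> items) {
        if (items == null || items.isEmpty()) {
            return Collections.emptyList();
        }
        return items.stream().map(HomePluginMappingSketch::toDto).collect(Collectors.toList());
    }
}
```

One difference from the committed code: this sketch returns an immutable empty list when there are no rows, whereas the commit returns a fresh `ArrayList`; that only matters if callers mutate the result.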
src/main/java/cn/com/poc/expose/rest/impl/ContentReportRestImpl.java
@@ -6,22 +6,30 @@ import cn.com.poc.common.utils.UUIDTool;
 import cn.com.poc.expose.dto.ContentReportDto;
 import cn.com.poc.expose.rest.ContentReportRest;
 import cn.com.yict.framemax.core.exception.BusinessException;
+import cn.hutool.Hutool;
+import cn.hutool.poi.word.Word07Writer;
 import com.itextpdf.text.pdf.PdfWriter;
 import com.vladsch.flexmark.html.HtmlRenderer;
 import com.vladsch.flexmark.parser.Parser;
 import com.vladsch.flexmark.util.ast.Document;
+import com.vladsch.flexmark.util.ast.Node;
 import org.apache.pdfbox.io.RandomAccessBuffer;
 import org.apache.pdfbox.io.RandomAccessBufferedFileInputStream;
 import org.apache.pdfbox.pdfparser.PDFParser;
 import org.apache.pdfbox.pdmodel.PDDocument;
+import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
 import org.apache.poi.poifs.filesystem.DirectoryEntry;
+import org.apache.poi.poifs.filesystem.DocumentEntry;
 import org.apache.poi.poifs.filesystem.POIFSFileSystem;
+import org.apache.poi.util.Units;
 import org.apache.poi.xwpf.usermodel.*;
 import org.springframework.stereotype.Component;

 import javax.annotation.Resource;
+import java.awt.*;
 import java.io.*;
 import java.nio.charset.StandardCharsets;
+import java.util.Base64;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
@@ -47,7 +55,7 @@ public class ContentReportRestImpl implements ContentReportRest {
         String result = "";
         if ("docx".equals(reportType) || "doc".equals(reportType)) {
             String htmlContent = convertMarkdownToHtml(markdown);
-            result = convertHtmlToWord(htmlContent);
+            result = exportWord(htmlContent);
         } else if ("html".equals(reportType)) {
             String htmlContent = convertMarkdownToHtml(markdown);
             result = bosConfigService.uploadFileByByteArray2Oss(htmlContent.getBytes(), UUIDTool.getUUID(), reportType);
@@ -71,15 +79,46 @@ public class ContentReportRestImpl implements ContentReportRest {
         return renderer.render(document);
     }

-    private String convertHtmlToWord(String html) throws IOException {
-        File file = File.createTempFile(UUIDTool.getUUID(), ".docx");
-        FileOutputStream outputStream = new FileOutputStream(file);
-        ByteArrayInputStream bais = new ByteArrayInputStream(html.getBytes()); // wrap the byte array in a stream
+//    private String convertHtmlToWord(String html) throws IOException {
+//        File file = File.createTempFile(UUIDTool.getUUID(), ".docx");
+//        Word07Writer writer = new Word07Writer();
+//        writer.addText(new Font("宋体", Font.PLAIN, 10), html);
+//        writer.flush(file);
+//        FileInputStream fileInputStream = new FileInputStream(file);
+//        String upload = bosConfigService.upload(fileInputStream, "docx", "application/msword");
+//        file.deleteOnExit();
+//        fileInputStream.close();
+//        return upload;
+//    }
+
+    /**
+     * @param content rich-text content to convert to Word
+     * @throws Exception
+     */
+    public String exportWord(String content) throws IOException {
+        if (!content.startsWith("<html>")) {
+            content = "<html>" + content;
+        }
+        if (!content.contains("<body>")) {
+            content = content.replaceFirst("<html>", "<html><body>") + "</body></html>";
+        }
+        byte b[] = content.getBytes("GBK"); // the encoding must be set here, otherwise exported Chinese text is garbled
+        ByteArrayInputStream bais = new ByteArrayInputStream(b); // wrap the byte array in a stream
         POIFSFileSystem poifs = new POIFSFileSystem();
         DirectoryEntry directory = poifs.getRoot();
-        directory.createDocument("WordDocument", bais);
-        poifs.writeFilesystem(outputStream);
+        DocumentEntry documentEntry = directory.createDocument("WordDocument", bais); // this step must not be omitted, otherwise the text is garbled
+        // write the output file
+        File file = File.createTempFile(UUIDTool.getUUID(), ".docx");
+        FileOutputStream ostream = new FileOutputStream(file);
+        poifs.writeFilesystem(ostream);
         FileInputStream fileInputStream = new FileInputStream(file);
-        return bosConfigService.upload(fileInputStream, "docx", "application/msword");
+        String upload = bosConfigService.upload(fileInputStream, "docx", "application/msword");
+        bais.close();
+        ostream.close();
+        poifs.close();
+        fileInputStream.close();
+        return upload;
     }
 }
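The new `exportWord` method closes its streams at the end of the method body, so an exception thrown by the upload call would leave them open, and the temporary file is never deleted. A rough sketch of the same flow with try-with-resources is shown below. It deliberately leaves out the POI `POIFSFileSystem` step and writes the bytes directly, and `ExportWordSketch`, `Uploader`, `wrapHtml`, and `export` are hypothetical names standing in for the commit's `bosConfigService.upload` call and surrounding code.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.file.Files;

// Sketch under stated assumptions: no POI container is written here, only the resource handling is shown.
final class ExportWordSketch {
    // Hypothetical stand-in for the upload service used in the commit.
    interface Uploader {
        String upload(InputStream in, String extension, String contentType) throws IOException;
    }

    // Same normalisation as the commit: ensure the content is wrapped in <html><body>...</body></html>.
    static String wrapHtml(String content) {
        if (!content.startsWith("<html>")) {
            content = "<html>" + content;
        }
        if (!content.contains("<body>")) {
            content = content.replaceFirst("<html>", "<html><body>") + "</body></html>";
        }
        return content;
    }

    static String export(String content, Uploader uploader) throws IOException {
        byte[] bytes = wrapHtml(content).getBytes(Charset.forName("GBK"));
        File file = File.createTempFile("report", ".docx");
        try {
            // Each stream is closed automatically, even when a later step throws.
            try (OutputStream out = new FileOutputStream(file)) {
                out.write(bytes);
            }
            try (InputStream in = new FileInputStream(file)) {
                return uploader.upload(in, "docx", "application/msword");
            }
        } finally {
            // Remove the temporary file whether or not the upload succeeded.
            Files.deleteIfExists(file.toPath());
        }
    }
}
```

In the real method the `POIFSFileSystem` could be declared in the same try-with-resources header, since the commit already calls `close()` on it.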
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/LargeModelFunctionEnum.java
@@ -4,8 +4,10 @@ import cn.com.poc.common.utils.SpringUtils;
 import cn.com.poc.thirdparty.resource.demand.ai.function.calculator.CalculatorFunction;
 import cn.com.poc.thirdparty.resource.demand.ai.function.document_reader.DocumentReaderFunction;
 import cn.com.poc.thirdparty.resource.demand.ai.function.document_understanding.DocumentUnderstandIngFunction;
+import cn.com.poc.thirdparty.resource.demand.ai.function.extraction.ContractExtractionFunction;
 import cn.com.poc.thirdparty.resource.demand.ai.function.html_reader.HtmlReaderFunction;
 import cn.com.poc.thirdparty.resource.demand.ai.function.image_ocr.ImageOCRFunction;
+import cn.com.poc.thirdparty.resource.demand.ai.function.long_document_reader.LongDocumentReaderFunction;
 import cn.com.poc.thirdparty.resource.demand.ai.function.long_memory.SetLongMemoryFunction;
 import cn.com.poc.thirdparty.resource.demand.ai.function.memory_variable_writer.MemoryVariableWriterFunction;
 import cn.com.poc.thirdparty.resource.demand.ai.function.notification_reminder.NotificationReminderFunction;
@@ -43,6 +45,10 @@ public enum LargeModelFunctionEnum {
     pdf_to_md(PdfToMDFunction.class),

+    long_document_reader(LongDocumentReaderFunction.class),
+
+    contract_extraction(ContractExtractionFunction.class),
+
     ;

     private Class<? extends AbstractLargeModelFunction> function;
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/document_understanding/DocumentUnderstandIngFunction.java
@@ -30,11 +30,7 @@ public class DocumentUnderstandIngFunction extends AbstractLargeModelFunction {
     private final String MODEL = "qwen-long";

-    private final String TEMPLATE = "# 工作规范:\n 1.工作流程:①对提供的文档内容进行理解,支持信息检索、摘要总结、文本分析。②根据用户提出的问题,提取或者总结文档中与问题相关的内容。2.工作限制:①要将问题与文档内容精准匹配,在理解文档时要带着问题去理解 \n\n" +
-            "## 文档内容\n" +
-            "${document_content}\n" +
-            "## 用户问题\n" +
-            "${question}";
+    private final String TEMPLATE = "# 工作规范:\n 1.工作流程:①对提供的文档内容进行理解,支持信息检索、摘要总结、文本分析。②根据用户提出的问题,提取或者总结文档中与问题相关的内容。2.工作限制:①要将问题与文档内容精准匹配,在理解文档时要带着问题去理解 \n\n";

     private final String DESC = "仅支持文档doc、docx、pdf、txt、md、xlsx、csv、xls,解析长文档内容理解,支持信息检索、摘要总结、文本分析能力,不可解析网页";
@@ -52,28 +48,44 @@ public class DocumentUnderstandIngFunction extends AbstractLargeModelFunction {
     @Override
     public AbstractFunctionResult<String> doFunction(String content, String identifier) {
         AbstractFunctionResult<String> result = new AbstractFunctionResult<>();
+        result.setFunctionResult(StringUtils.EMPTY);
+        result.setPromptContent(StringUtils.EMPTY);
         if (StringUtils.isBlank(content)) {
-            result.setFunctionResult(StringUtils.EMPTY);
-            result.setPromptContent(StringUtils.EMPTY);
             return result;
         }
         JSONObject jsonObject = JSON.parseObject(content);
+        if (!jsonObject.containsKey("question") || !jsonObject.containsKey("file_url")) {
+            return result;
+        }
         String question = jsonObject.getString("question");
         String fileUrl = jsonObject.getString("file_url");
         File file = DocumentLoad.downloadURLDocument(fileUrl);
         String documentContent;
         try {
             documentContent = DocumentLoad.documentToText(file);
+            if (StringUtils.isBlank(documentContent)) {
+                return result;
+            }
         } catch (Exception e) {
-            documentContent = StringUtils.EMPTY;
+            return result;
         }
+        Message systemMessage = new Message();
+        systemMessage.setRole("system");
+        systemMessage.setContent(TEMPLATE);
+
+        Message fileContentMessage = new Message();
+        fileContentMessage.setRole("system");
+        fileContentMessage.setContent(documentContent);
+
         Message message = new Message();
         message.setRole("user");
-        message.setContent(TEMPLATE.replace("${document_content}", documentContent).replace("${question}", question));
+        message.setContent(question);
         LargeModelResponse largeModelResponse = new LargeModelResponse();
         largeModelResponse.setModel(MODEL);
-        largeModelResponse.setMessages(new Message[]{message});
+        largeModelResponse.setMessages(new Message[]{systemMessage, fileContentMessage, message});
         largeModelResponse.setStream(false);
         largeModelResponse.setUser("Document_Understanding");
         LargeModelDemandResult largeModelDemandResult = llmService.chat(largeModelResponse);
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/extraction/ContractExtractionFunction.java
new file, mode 0 → 100644
package cn.com.poc.thirdparty.resource.demand.ai.function.extraction;

import cn.com.poc.agent_application.entity.Variable;
import cn.com.poc.common.utils.JsonUtils;
import cn.com.poc.thirdparty.resource.demand.ai.function.AbstractFunctionResult;
import cn.com.poc.thirdparty.resource.demand.ai.function.AbstractLargeModelFunction;
import cn.com.poc.thirdparty.resource.demand.ai.function.entity.FunctionLLMConfig;
import cn.com.poc.thirdparty.resource.demand.ai.function.entity.Parameters;
import cn.com.poc.thirdparty.resource.demand.ai.function.entity.Properties;
import cn.com.poc.thirdparty.resource.demand.ai.function.extraction.entity.KeyInfo;
import cn.com.poc.thirdparty.resource.textin.api.TextInClient;
import cn.hutool.core.collection.ListUtil;
import cn.hutool.json.JSONException;
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import org.springframework.stereotype.Component;

import java.util.ArrayList;
import java.util.List;

/**
 * Contract key-information extraction (element extraction)
 *
 * @author alex.yao
 * @date 2025/5/12
 */
@Component
public class ContractExtractionFunction extends AbstractLargeModelFunction {

    private final String DESC = "合同关键信息抽取";

    private final TextInClient textInClient = new TextInClient();

    private final FunctionLLMConfig functionLLMConfig = new FunctionLLMConfig.FunctionLLMConfigBuilder()
            .name("contract_extraction")
            .parameters(new Parameters("array")
                    .addProperties("file_url", new Properties("string", "文件链接, 合同文件的在线地址"))
                    .addProperties("key_info", new Properties("string", "关键信息名称, 长度限制20个字符"))
                    .addProperties("paraphrase_names", new Properties("array", "相似名字段,字符串数组, 可根据相似名精准抽取关键信息, 最多填写3个,每个释义名称长度限制20个字符"))
                    .addProperties("field_type", new Properties("string", "字段类型字段, 可选项有,时间:time, 金额:amount, 地址:address, 公司:company, 姓名:name, 描述(长文本):long_text_description, 其他:other, 印章:stamp, 分别对应产品段配置的字段类型"))
                    .addProperties("keywords", new Properties("array", "关键字字段, 字符串数组, 可根据关键字信息,快速定位抽取信所在段落范围, 最多填写10个,且字符总长度不超过50"))
            )
            .description(DESC)
            .build();

    @Override
    public AbstractFunctionResult<String> doFunction(String content, String identifier) {
        AbstractFunctionResult<String> result = new AbstractFunctionResult<>();
        String fileUrl;
        List<KeyInfo> keyInfos = new ArrayList<>();
        if (isJsonArray(content)) {
            JSONArray jsonArray = JSONArray.parseArray(content);
            if (jsonArray.isEmpty()) {
                return result;
            }
            fileUrl = jsonArray.getJSONObject(0).getString("file_url");
            for (int i = 0; i < jsonArray.size(); i++) {
                JSONObject jsonObject = jsonArray.getJSONObject(i);
                KeyInfo keyInfo = new KeyInfo();
                if (jsonObject.containsKey("field_type")) {
                    // Read the same key that was just checked.
                    keyInfo.setField_type(jsonObject.getString("field_type"));
                }
                if (jsonObject.containsKey("key_info")) {
                    keyInfo.setKey_info(jsonObject.getString("key_info"));
                }
                if (jsonObject.containsKey("paraphrase_names")) {
                    keyInfo.setParaphrase_names(jsonObject.getJSONArray("paraphrase_names").toArray(new String[0]));
                }
                if (jsonObject.containsKey("keywords")) {
                    keyInfo.setKeywords(jsonObject.getJSONArray("keywords").toArray(new String[0]));
                }
                keyInfos.add(keyInfo);
            }
        } else {
            JSONObject jsonObject = JSONObject.parseObject(content);
            fileUrl = jsonObject.getString("file_url");
            KeyInfo keyInfo = new KeyInfo();
            if (jsonObject.containsKey("field_type")) {
                // Read the same key that was just checked.
                keyInfo.setField_type(jsonObject.getString("field_type"));
            }
            if (jsonObject.containsKey("key_info")) {
                keyInfo.setKey_info(jsonObject.getString("key_info"));
            }
            if (jsonObject.containsKey("paraphrase_names")) {
                keyInfo.setParaphrase_names(jsonObject.getJSONArray("paraphrase_names").toArray(new String[0]));
            }
            if (jsonObject.containsKey("keywords")) {
                keyInfo.setKeywords(jsonObject.getJSONArray("keywords").toArray(new String[0]));
            }
            keyInfos.add(keyInfo);
        }
        String extraction = textInClient.extraction(fileUrl, keyInfos);
        result.setFunctionResult(extraction
);
result
.
setPromptContent
(
extraction
);
return
result
;
}
@Override
public
String
getDesc
()
{
return
DESC
;
}
@Override
public
List
<
String
>
getLLMConfig
()
{
return
ListUtil
.
toList
(
JsonUtils
.
serialize
(
functionLLMConfig
));
}
@Override
public
List
<
String
>
getLLMConfig
(
List
<
Variable
>
variableStructure
)
{
return
this
.
getLLMConfig
();
}
private
boolean
isJsonArray
(
String
json
)
{
try
{
new
cn
.
hutool
.
json
.
JSONArray
(
json
);
return
true
;
}
catch
(
JSONException
e
)
{
return
false
;
}
}
}
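The array branch and the single-object branch of `doFunction` repeat the same field mapping. A minimal sketch of one shared mapper follows, assuming the decoded JSON is available as a plain `Map` (a stand-in for fastjson's `JSONObject`). `KeyInfoMapper` and `KeyInfoSketch` are hypothetical names used only for illustration; they are not part of this commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class KeyInfoMapper {

    /** Illustrative stand-in for the KeyInfo entity added in this commit. */
    static final class KeyInfoSketch {
        String keyInfo;
        String fieldType;
        String[] paraphraseNames = new String[0];
        String[] keywords = new String[0];
    }

    /** Maps one decoded JSON object; keys that are absent keep their defaults. */
    static KeyInfoSketch toKeyInfo(Map<String, Object> json) {
        KeyInfoSketch info = new KeyInfoSketch();
        if (json.containsKey("field_type")) {
            info.fieldType = String.valueOf(json.get("field_type"));
        }
        if (json.containsKey("key_info")) {
            info.keyInfo = String.valueOf(json.get("key_info"));
        }
        info.paraphraseNames = toStringArray(json.get("paraphrase_names"));
        info.keywords = toStringArray(json.get("keywords"));
        return info;
    }

    /** Used for the array payload: every element goes through the same mapper. */
    static List<KeyInfoSketch> toKeyInfos(List<Map<String, Object>> items) {
        List<KeyInfoSketch> result = new ArrayList<>();
        for (Map<String, Object> item : items) {
            result.add(toKeyInfo(item));
        }
        return result;
    }

    /** Converts a decoded JSON list to String[]; anything else yields an empty array. */
    private static String[] toStringArray(Object value) {
        if (!(value instanceof List)) {
            return new String[0];
        }
        List<?> list = (List<?>) value;
        String[] out = new String[list.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = String.valueOf(list.get(i));
        }
        return out;
    }
}
```

With a helper of this shape, both payload forms reduce to a single call, so a key name is written once per field.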
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/extraction/entity/Config.java
0 → 100644
package cn.com.poc.thirdparty.resource.demand.ai.function.extraction.entity;

/**
 * @author alex.yao
 * @date 2025/5/12
 */
public class Config {
    public String engine;
    public String use_pdf_parser;
    public String use_semantic_match;
    public String remove_watermark;

    public Config(String engine, String use_pdf_parser, String use_semantic_match, String remove_watermark) {
        this.engine = engine;
        this.use_pdf_parser = use_pdf_parser;
        this.use_semantic_match = use_semantic_match;
        this.remove_watermark = remove_watermark;
    }
}
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/extraction/entity/KeyInfo.java
0 → 100644
package cn.com.poc.thirdparty.resource.demand.ai.function.extraction.entity;

/**
 * @author alex.yao
 * @date 2025/5/12
 */
public class KeyInfo {
    public String key_info;
    public String[] paraphrase_names;
    public String field_type;
    public boolean is_in_table;
    public String[] keywords;

    public KeyInfo() {
    }

    public KeyInfo(String key_info, String[] paraphrase_names, String field_type, boolean is_in_table, String[] keywords) {
        this.key_info = key_info;
        this.paraphrase_names = paraphrase_names;
        this.field_type = field_type;
        this.is_in_table = is_in_table;
        this.keywords = keywords;
    }

    public String getKey_info() {
        return key_info;
    }

    public void setKey_info(String key_info) {
        this.key_info = key_info;
    }

    public String[] getParaphrase_names() {
        return paraphrase_names;
    }

    public void setParaphrase_names(String[] paraphrase_names) {
        this.paraphrase_names = paraphrase_names;
    }

    public String getField_type() {
        return field_type;
    }

    public void setField_type(String field_type) {
        this.field_type = field_type;
    }

    public boolean isIs_in_table() {
        return is_in_table;
    }

    public void setIs_in_table(boolean is_in_table) {
        this.is_in_table = is_in_table;
    }

    public String[] getKeywords() {
        return keywords;
    }

    public void setKeywords(String[] keywords) {
        this.keywords = keywords;
    }
}
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/extraction/entity/RequestData.java
0 → 100644
package cn.com.poc.thirdparty.resource.demand.ai.function.extraction.entity;

/**
 * @author alex.yao
 * @date 2025/5/12
 */
public class RequestData {
    public Integer token_mode;
    public String creator;
    public Config config;
    public String filedata;
    public String filename;
    public KeyInfo[] key_info_list;

    public RequestData(String creator, Config config, String filedata, String filename, KeyInfo[] key_info_list) {
        token_mode = 1;
        this.creator = creator;
        this.config = config;
        this.filedata = filedata;
        this.filename = filename;
        this.key_info_list = key_info_list;
    }
}
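`RequestData.filedata` carries the document as Base64 text; `TextInClient.extraction` in this commit fills it from the downloaded file's bytes. The encoding step on its own can be sketched with JDK classes only. `FileDataEncoder` is an illustrative name, not something the commit defines.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class FileDataEncoder {

    /** Encodes raw bytes with the standard (non-URL-safe) Base64 alphabet. */
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Reads the whole file into memory and encodes it; suitable for small documents only. */
    static String encodeFile(Path path) {
        try {
            return encode(Files.readAllBytes(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the whole file is held in memory twice (raw and encoded), very large contracts may need a size check before this step; the commit itself does not include one.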
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/image_ocr/ImageOCRFunction.java
...
@@ -50,6 +50,12 @@ public class ImageOCRFunction extends AbstractLargeModelFunction {
     public AbstractFunctionResult<String> doFunction(String content, String identifier) {
         AbstractFunctionResult<String> result = new AbstractFunctionResult<>();
         JSONObject jsonObject = JSONObject.parseObject(content);
+        if (!jsonObject.containsKey("query") || !jsonObject.containsKey("image_url")) {
+            result.setPromptContent(content);
+            result.setFunctionResult(content);
+            return result;
+        }
         Message systemMessage = new Message();
         systemMessage.setRole(MessageRoleConstant.system);
...
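The early return added in this hunk (bail out when a required key is missing) appears in several functions touched by this commit. One way to write the check once is sketched below, with a plain `Map` standing in for `JSONObject`. `RequiredKeys` is a hypothetical helper, not part of the repository.

```java
import java.util.Map;

final class RequiredKeys {

    /** Returns true only when every listed key is present in the decoded payload. */
    static boolean hasAll(Map<String, ?> json, String... keys) {
        if (json == null) {
            return false;
        }
        for (String key : keys) {
            if (!json.containsKey(key)) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would then write `if (!RequiredKeys.hasAll(payload, "query", "image_url")) { return result; }`. Note that, like `containsKey` in the hunk, this only checks presence and not whether the value is null or blank.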
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/long_document_reader/LongDocumentReaderFunction.java
0 → 100644
package cn.com.poc.thirdparty.resource.demand.ai.function.long_document_reader;

import cn.com.poc.agent_application.entity.Variable;
import cn.com.poc.common.utils.DocumentLoad;
import cn.com.poc.common.utils.JsonUtils;
import cn.com.poc.common.utils.StringUtils;
import cn.com.poc.thirdparty.resource.demand.ai.entity.dialogue.Message;
import cn.com.poc.thirdparty.resource.demand.ai.entity.largemodel.LargeModelDemandResult;
import cn.com.poc.thirdparty.resource.demand.ai.entity.largemodel.LargeModelResponse;
import cn.com.poc.thirdparty.resource.demand.ai.function.AbstractFunctionResult;
import cn.com.poc.thirdparty.resource.demand.ai.function.AbstractLargeModelFunction;
import cn.com.poc.thirdparty.resource.demand.ai.function.entity.FunctionLLMConfig;
import cn.com.poc.thirdparty.resource.demand.ai.function.entity.Parameters;
import cn.com.poc.thirdparty.resource.demand.ai.function.entity.Properties;
import cn.com.poc.thirdparty.service.LLMService;
import cn.hutool.core.collection.ListUtil;
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONObject;
import org.springframework.stereotype.Component;
import javax.annotation.Resource;
import java.io.File;
import java.util.List;

/**
 * @author alex.yao
 * @date 2025/5/12
 */
@Component
public class LongDocumentReaderFunction extends AbstractLargeModelFunction {

    @Resource
    private LLMService llmService;

    private final String MODEL = "qwen-long";

    private final String DESC = "长文档理解,适合用于文档内容长、文件大的文档,结合用户问题与文档理解的插件,仅支持文档doc、docx、pdf、txt、md、xlsx、csv、xls";

    private final String SYSTEM_PROMPT = "# 工作规范:\n 1.工作流程:①对提供的文档内容进行理解,支持信息检索、摘要总结、文本分析。②根据用户提出的问题,提取或者总结文档中与问题相关的内容。2.工作限制:①要将问题与文档内容精准匹配,在理解文档时要带着问题去理解 \n\n";

    private final FunctionLLMConfig functionLLMConfig = new FunctionLLMConfig.FunctionLLMConfigBuilder()
            .name("long_document_reader")
            .description(DESC)
            .parameters(new Parameters("object")
                    .addProperties("question", new Properties("string", "用户的问题"))
                    .addProperties("file_url", new Properties("string", "doc、docx、pdf、txt、md、xlsx、csv、xls文件地址"))
            ).build();

    @Override
    public AbstractFunctionResult<String> doFunction(String content, String identifier) {
        AbstractFunctionResult<String> result = new AbstractFunctionResult<>();
        result.setFunctionResult(StringUtils.EMPTY);
        result.setPromptContent(StringUtils.EMPTY);
        if (StringUtils.isBlank(content)) {
            return result;
        }
        JSONObject jsonObject = JSON.parseObject(content);
        if (!jsonObject.containsKey("question") || !jsonObject.containsKey("file_url")) {
            return result;
        }
        String question = jsonObject.getString("question");
        String fileUrl = jsonObject.getString("file_url");
        File file = DocumentLoad.downloadURLDocument(fileUrl);
        String documentContent;
        try {
            documentContent = DocumentLoad.documentToText(file);
        } catch (Exception e) {
            return result;
        }
        Message systemMessage = new Message();
        systemMessage.setRole("system");
        systemMessage.setContent(SYSTEM_PROMPT);
        Message documentMessage = new Message();
        documentMessage.setRole("system");
        documentMessage.setContent(documentContent);
        Message userMessage = new Message();
        userMessage.setContent(question);
        userMessage.setRole("user");
        Message[] messages = new Message[]{systemMessage, documentMessage, userMessage};
        LargeModelResponse largeModelResponse = new LargeModelResponse();
        largeModelResponse.setModel(MODEL);
        largeModelResponse.setMessages(messages);
        largeModelResponse.setStream(false);
        largeModelResponse.setUser("Long_Document_Reader");
        LargeModelDemandResult largeModelDemandResult = llmService.chat(largeModelResponse);
        if (largeModelDemandResult == null) {
            result.setFunctionResult(StringUtils.EMPTY);
            result.setPromptContent(StringUtils.EMPTY);
            return result;
        }
        result.setFunctionResult(largeModelDemandResult.getMessage());
        result.setPromptContent(largeModelDemandResult.getMessage());
        return result;
    }

    @Override
    public String getDesc() {
        return DESC;
    }

    @Override
    public List<String> getLLMConfig() {
        return ListUtil.toList(JsonUtils.serialize(this.functionLLMConfig));
    }

    @Override
    public List<String> getLLMConfig(List<Variable> variableStructure) {
        return this.getLLMConfig();
    }
}
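`doFunction` sends three messages in a fixed order: the system prompt, the extracted document text (also with the `system` role), and the user's question. The ordering can be sketched in isolation as below. `Msg` and `LongDocMessages` are illustrative stand-ins for the project's `Message` type, and the blank-input check is an added assumption; the committed code does not reject an empty document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class LongDocMessages {

    /** Minimal role/content pair, standing in for the project's Message entity. */
    static final class Msg {
        final String role;
        final String content;

        Msg(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    /**
     * Builds the prompt in the same order as doFunction: instructions, document, question.
     * Returns an empty list when the document or the question is blank (an assumption of this sketch).
     */
    static List<Msg> build(String systemPrompt, String documentText, String question) {
        if (isBlank(documentText) || isBlank(question)) {
            return Collections.emptyList();
        }
        List<Msg> messages = new ArrayList<>();
        messages.add(new Msg("system", systemPrompt));
        messages.add(new Msg("system", documentText));
        messages.add(new Msg("user", question));
        return messages;
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }
}
```

Keeping the document in its own message, separate from the instructions, means the instruction text stays constant across calls while only the second and third entries vary.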
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/text_in_pdf2md/PdfToMDFunction.java
...
@@ -7,7 +7,7 @@ import cn.com.poc.thirdparty.resource.demand.ai.function.AbstractLargeModelFunct
 import cn.com.poc.thirdparty.resource.demand.ai.function.entity.FunctionLLMConfig;
 import cn.com.poc.thirdparty.resource.demand.ai.function.entity.Parameters;
 import cn.com.poc.thirdparty.resource.demand.ai.function.entity.Properties;
-import cn.com.poc.thirdparty.resource.demand.ai.function.text_in_pdf2md.api.OCRClient;
+import cn.com.poc.thirdparty.resource.textin.api.TextInClient;
 import cn.hutool.core.collection.ListUtil;
 import com.alibaba.fastjson.JSONObject;
 import com.fasterxml.jackson.databind.JsonNode;
...
@@ -43,6 +43,11 @@ public class PdfToMDFunction extends AbstractLargeModelFunction {
     public AbstractFunctionResult<String> doFunction(String content, String identifier) {
         AbstractFunctionResult<String> result = new AbstractFunctionResult<String>();
         JSONObject jsonObject = JSONObject.parseObject(content);
+        if (!jsonObject.containsKey("file_url")) {
+            result.setPromptContent(content);
+            result.setFunctionResult(content);
+            return result;
+        }
         String url = jsonObject.getString("file_url");
         byte[] fileContent = url.getBytes(StandardCharsets.UTF_8);
         HashMap<String, Object> options = new HashMap<>();
...
@@ -58,15 +63,19 @@ public class PdfToMDFunction extends AbstractLargeModelFunction {
         options.put("paratext_mode", "annotation");
         options.put("parse_mode", "auto");
         options.put("table_flavor", "md");
-        OCRClient client = new OCRClient();
         try {
-            String response = client.recognize(fileContent, options);
+            TextInClient textInClient = new TextInClient();
+            String response = textInClient.OCR(fileContent, options);
             ObjectMapper mapper = new ObjectMapper();
             JsonNode jsonNode = mapper.readTree(response);
             if (jsonNode.has("result") && jsonNode.get("result").has("markdown")) {
                 String markdown = jsonNode.get("result").get("markdown").asText();
                 result.setPromptContent(markdown);
                 result.setFunctionResult(markdown);
+            } else {
+                logger.warn("text in 文档信息提取异常:{}", response);
+                result.setFunctionResult(response);
+                result.setPromptContent("FAIL");
             }
             return result;
         } catch (Exception e) {
...
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/text_in_pdf2md/api/OCRClient.java
deleted
100644 → 0
package cn.com.poc.thirdparty.resource.demand.ai.function.text_in_pdf2md.api;

/**
 * @author alex.yao
 * @date 2025/5/7
 */

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.HashMap;
import java.util.Map;

public class OCRClient {

    private Logger logger = LoggerFactory.getLogger(OCRClient.class);

    private final String appId = "dafd04a574230c00ccba61132160de0c";
    private final String secretCode = "3bc03c7e6f9402963e6e71d16d786a9c";
    private final String baseUrl = "https://api.textin.com/ai/service/v1/pdf_to_markdown";

    public OCRClient() {
    }

    public String recognize(byte[] fileContent, HashMap<String, Object> options) throws IOException {
        StringBuilder queryParams = new StringBuilder();
        for (Map.Entry<String, Object> entry : options.entrySet()) {
            if (queryParams.length() > 0) {
                queryParams.append("&");
            }
            queryParams.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                    .append("=")
                    .append(URLEncoder.encode(entry.getValue().toString(), "UTF-8"));
        }
        String fullUrl = baseUrl + (queryParams.length() > 0 ? "?" + queryParams : "");
        URL url = new URL(fullUrl);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("x-ti-app-id", appId);
        connection.setRequestProperty("x-ti-secret-code", secretCode);
        connection.setRequestProperty("Content-Type", "text/plain;charset=utf-8");
        connection.setDoOutput(true);
        try (OutputStream os = connection.getOutputStream()) {
            os.write(fileContent);
            os.flush();
        }
        int responseCode = connection.getResponseCode();
        if (responseCode == HttpURLConnection.HTTP_OK) {
            try (BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
                StringBuilder response = new StringBuilder();
                String inputLine;
                while ((inputLine = in.readLine()) != null) {
                    response.append(inputLine);
                }
                return response.toString();
            }
        } else {
            logger.error("HTTP request failed with code: {}, Error message view :{}", responseCode, "https://www.textin.com/document/pdf_to_markdown");
            throw new IOException("HTTP request failed with code: " + responseCode);
        }
    }
}
\ No newline at end of file
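Both the removed `OCRClient.recognize` and its replacement `TextInClient.OCR` build the query string by hand from the options map. A standalone sketch of that step is shown here, using only `java.net.URLEncoder`. `QueryStrings` is a hypothetical class name, and wrapping the checked exception is a choice made for this sketch rather than something the commit does.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

final class QueryStrings {

    /** Joins the entries as key=value pairs separated by '&', URL-encoding both sides as UTF-8. */
    static String build(Map<String, Object> options) {
        StringBuilder query = new StringBuilder();
        try {
            for (Map.Entry<String, Object> entry : options.entrySet()) {
                if (query.length() > 0) {
                    query.append('&');
                }
                query.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                        .append('=')
                        .append(URLEncoder.encode(String.valueOf(entry.getValue()), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
        return query.toString();
    }
}
```

The parameter order follows the map's iteration order, so a `HashMap` (as used in the commit) gives no ordering guarantee; a `LinkedHashMap` keeps insertion order if that matters for logging or tests. `String.valueOf` is used instead of `toString()` so a null option value does not throw.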
src/main/java/cn/com/poc/thirdparty/resource/textin/api/TextInClient.java
0 → 100644
package cn.com.poc.thirdparty.resource.textin.api;

/**
 * @author alex.yao
 * @date 2025/5/7
 */

import cn.com.poc.common.utils.DocumentLoad;
import cn.com.poc.common.utils.http.LocalHttpClient;
import cn.com.poc.thirdparty.resource.demand.ai.function.extraction.entity.Config;
import cn.com.poc.thirdparty.resource.demand.ai.function.extraction.entity.KeyInfo;
import cn.com.poc.thirdparty.resource.demand.ai.function.extraction.entity.RequestData;
import cn.com.yict.framemax.core.exception.BusinessException;
import com.alibaba.fastjson.JSONObject;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.commons.lang3.StringUtils;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.client.methods.RequestBuilder;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TextInClient {

    final private Logger logger = LoggerFactory.getLogger(TextInClient.class);

    private String appId = "dafd04a574230c00ccba61132160de0c";
    private String secretCode = "3bc03c7e6f9402963e6e71d16d786a9c";

    public TextInClient() {
    }

    public TextInClient(String appId, String secretCode) {
        this.appId = appId;
        this.secretCode = secretCode;
    }

    /**
     * OCR: posts the content to the TextIn pdf_to_markdown endpoint.
     *
     * @param fileContent
     * @param options
     * @return
     * @throws IOException
     */
    public String OCR(byte[] fileContent, HashMap<String, Object> options) throws IOException {
        StringBuilder queryParams = new StringBuilder();
        for (Map.Entry<String, Object> entry : options.entrySet()) {
            if (queryParams.length() > 0) {
                queryParams.append("&");
            }
            queryParams.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                    .append("=")
                    .append(URLEncoder.encode(entry.getValue().toString(), "UTF-8"));
        }
        HttpURLConnection connection = getOCRHttpURLConnection(queryParams);
        try (OutputStream os = connection.getOutputStream()) {
            os.write(fileContent);
            os.flush();
        }
        int responseCode = connection.getResponseCode();
        if (responseCode == HttpURLConnection.HTTP_OK) {
            try (BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
                StringBuilder response = new StringBuilder();
                String inputLine;
                while ((inputLine = in.readLine()) != null) {
                    response.append(inputLine);
                }
                return response.toString();
            }
        } else {
            logger.error("HTTP request failed with code: {}, Error message view :{}", responseCode, "https://www.textin.com/document/pdf_to_markdown");
            throw new IOException("HTTP request failed with code: " + responseCode);
        }
    }

    private HttpURLConnection getOCRHttpURLConnection(StringBuilder queryParams) throws IOException {
        String baseUrl = "https://api.textin.com/ai/service/v1/pdf_to_markdown";
        String fullUrl = baseUrl + (queryParams.length() > 0 ? "?" + queryParams : "");
        URL url = new URL(fullUrl);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("x-ti-app-id", appId);
        connection.setRequestProperty("x-ti-secret-code", secretCode);
        connection.setRequestProperty("Content-Type", "text/plain;charset=utf-8");
        connection.setDoOutput(true);
        return connection;
    }

    /**
     * [Contract extraction] Create an extraction task.
     * https://www.textin.com/document/doc_extraction_create
     *
     * @param fileUrl
     * @param keyInfoList
     * @return
     */
    public String extraction(String fileUrl, List<KeyInfo> keyInfoList) {
        try {
            // Read the file and convert it to Base64
            File file = DocumentLoad.downloadURLDocument(fileUrl);
            byte[] fileData = Files.readAllBytes(file.toPath());
            String base64FileData = Base64.getEncoder().encodeToString(fileData);
            // Get the file name
            String fileName = file.getName();
            // Build the request data
            Config config = new Config("table", "true", "true", "false");
            RequestData requestData = new RequestData("", config, base64FileData, fileName, keyInfoList.toArray(new KeyInfo[0]));
            // Create an ObjectMapper and serialize the Java object to JSON
            ObjectMapper objectMapper = new ObjectMapper();
            String requestDataJson = objectMapper.writeValueAsString(requestData);
            // Create the URL object
            URL url = new URL("https://doc-compare.intsig.com/api/contracts/v3/extraction/external/create");
            // Open the HTTP connection
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("POST");
            connection.setRequestProperty("x-ti-app-id", appId);
            connection.setRequestProperty("x-ti-secret-code", secretCode);
            connection.setRequestProperty("Content-Type", "application/json;charset=utf-8");
            connection.setDoOutput(true);
            // Open the output stream
            // Send the request data
            try (OutputStream os = connection.getOutputStream()) {
                byte[] input = requestDataJson.getBytes(StandardCharsets.UTF_8);
                os.write(input, 0, input.length);
            }
            // Get the response code
            int status = connection.getResponseCode();
            logger.info("Response Code: {}", status);
            // Read the response body
            try (BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
                String inputLine;
                StringBuilder response = new StringBuilder();
                while ((inputLine = in.readLine()) != null) {
                    response.append(inputLine);
                }
                // Log the response body
                logger.info("Response: {}", response);
                JSONObject jsonResponse = JSONObject.parseObject(response.toString());
                String taskId = jsonResponse.getJSONObject("result").getString("task_id");
                return extractedResults(taskId);
            }
        } catch (IOException e) {
            throw new BusinessException(e);
        }
    }

    /**
     * [Contract extraction] Fetch the extraction result.
     * https://www.textin.com/document/doc_extraction_result
     *
     * @param taskId
     * @return
     */
    private String extractedResults(String taskId) {
        String baseUrl = "https://doc-compare.intsig.com/doc_extraction/keyinfo/extracted_results?format=json&task_id=" + taskId;
        HttpUriRequest httpUriRequest = RequestBuilder.post()
                .setUri(baseUrl)
                .addHeader("x-ti-app-id", appId)
                .addHeader("x-ti-secret-code", secretCode)
                .addHeader("Content-Type", "application/json;charset=utf-8")
                .build();
        String result = LocalHttpClient.executeJsonResult(httpUriRequest, String.class);
        JSONObject resultJson = JSONObject.parseObject(result);
        Integer code = resultJson.getInteger("code");
        if (code.equals(200)) {
            return resultJson.getJSONObject("result").toJSONString();
        } else {
            // The original message said "failed to get token", which does not match this call.
            logger.error("Failed to fetch extraction result, error code: {}", code);
            return StringUtils.EMPTY;
        }
    }
}
\ No newline at end of file
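`TextInClient` keeps its app id and secret code as field defaults, and also offers a two-argument constructor. A hedged sketch of resolving the two values from the environment and passing them through that constructor is given below. The variable names `TEXTIN_APP_ID` and `TEXTIN_SECRET_CODE` and the class `TextInCredentials` are assumptions made for illustration; nothing in the commit defines them.

```java
import java.util.Map;

final class TextInCredentials {
    final String appId;
    final String secretCode;

    private TextInCredentials(String appId, String secretCode) {
        this.appId = appId;
        this.secretCode = secretCode;
    }

    /**
     * Looks up both values in the given environment map (pass System.getenv() in production).
     * Fails fast when either one is missing, so a misconfigured deployment is caught at startup.
     */
    static TextInCredentials fromEnv(Map<String, String> env) {
        String appId = env.get("TEXTIN_APP_ID");           // assumed variable name
        String secretCode = env.get("TEXTIN_SECRET_CODE"); // assumed variable name
        if (isEmpty(appId) || isEmpty(secretCode)) {
            throw new IllegalStateException("TextIn credentials are not configured");
        }
        return new TextInCredentials(appId, secretCode);
    }

    private static boolean isEmpty(String value) {
        return value == null || value.isEmpty();
    }
}
```

Usage would look like `TextInCredentials c = TextInCredentials.fromEnv(System.getenv()); new TextInClient(c.appId, c.secretCode);`. Taking the environment as a parameter keeps the lookup testable without touching the real process environment.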
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/text_in_pdf2md/entity/PdfToMDResponse.java → src/main/java/cn/com/poc/thirdparty/resource/textin/entity/PdfToMDResponse.java
-package cn.com.poc.thirdparty.resource.demand.ai.function.text_in_pdf2md.entity;
+package cn.com.poc.thirdparty.resource.textin.entity;
 /**
  * @author alex.yao
...
src/main/java/cn/com/poc/thirdparty/resource/demand/ai/function/text_in_pdf2md/entity/PdfToMDResult.java → src/main/java/cn/com/poc/thirdparty/resource/textin/entity/PdfToMDResult.java
-package cn.com.poc.thirdparty.resource.demand.ai.function.text_in_pdf2md.entity;
+package cn.com.poc.thirdparty.resource.textin.entity;
 /**
  * @author alex.yao
...
src/test/java/cn/com/poc/expose/ContentReportTest.java
0 → 100644
package cn.com.poc.expose;

import cn.com.poc.common.service.BosConfigService;
import cn.com.poc.common.utils.UUIDTool;
import cn.com.poc.expose.dto.ContentReportDto;
import cn.com.poc.expose.rest.ContentReportRest;
import cn.com.yict.framemax.core.spring.SingleContextInitializer;
import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.poifs.filesystem.DocumentEntry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.junit.runner.RunWith;
import org.junit.Test;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.springframework.test.context.web.WebAppConfiguration;
import javax.annotation.Resource;
import java.io.*;

/**
 * @author alex.yao
 * @date 2025/5/12
 */
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(initializers = SingleContextInitializer.class)
@WebAppConfiguration
public class ContentReportTest {

    @Resource
    private ContentReportRest contentReportRest;

    @Test
    public void test_report() throws IOException {
        String content = "<html><body><p>在Markdown中,你可以使用LaTeX语法来输出数学公式,包括三角函数公式。要在Markdown中插入LaTeX公式,你需要使用<code>$</code>符号将公式包围起来。对于行内公式,使用单个<code>$</code>符号,而对于独立的公式块,使用两个<code>$$</code>符号。</p><p>下面是一些三角函数公式的例子:</p><h3>行内公式</h3><ul><li>正弦函数:<code>$\\sin(x)$</code></li><li>余弦函数:<code>$\\cos(x)$</code></li><li>正切函数:<code>$\\tan(x)$</code></li></ul><p>将以上代码插入Markdown文档中,将会得到相应的行内公式。</p><h3>公式块</h3><p>如果你想让公式独占一行并居中显示,可以使用两个<code>$$</code>符号来创建一个公式块。</p><p>例如:</p><pre> </pre><div class=\"code-render-container\"><div class=\"code-operation-bar-container\"><span class=\"language\">markdown</span></div><div class=\"code-render-wrapper\"><pre class=\"code-render-inner\"><code>$$\n"
                + "\\sin^2(x) + \\cos^2(x) = 1\n"
                + "$$</code></pre></div></div><pre><code class=\"hljs code-container-wrapper language-markdown\">\n"
                + "</code></pre><pre> </pre><div class=\"code-render-container\"><div class=\"code-operation-bar-container\"><span class=\"language\">markdown</span></div><div class=\"code-render-wrapper\"><pre class=\"code-render-inner\"><code>$$\n"
                + "\\tan(x) = \\frac{\\sin(x)}{\\cos(x)}\n"
                + "$$</code></pre></div></div><pre><code class=\"hljs code-container-wrapper language-markdown\">\n"
                + "</code></pre><p>在渲染后的Markdown文档中,这些代码将会生成独立的公式块,其中包含相应的三角函数公式。</p><p>请注意,为了正确渲染LaTeX公式,你使用的Markdown编辑器或查看器需要支持LaTeX渲染。许多流行的Markdown编辑器,如Typora、VS Code(配合扩展),以及在线Markdown编辑器如StackEdit,都支持LaTeX公式的渲染。如果你使用的是不支持LaTeX的编辑器或查看器,你可能需要寻找其他解决方案或转换工具来查看渲染后的公式。</p></body></html>";
        String reportType = "doc";
        ContentReportDto dto = new ContentReportDto();
        dto.setContent(content);
        dto.setReportType(reportType);
        System.out.println(contentReportRest.report(dto));
    }

    @Resource
    private BosConfigService bosConfigService;

    @Test
    public void test_report2() throws IOException {
        String content = "<h1>标题头</h1><h2>第二个标题</h2><a href=\"www.baidu.com\">百度搜索</a>";
        StringBuffer sbf = new StringBuffer();
        sbf.append("<html><body>");
        sbf.append(content);
        // Closing tag fixed: the original literal was missing the final '>'.
        sbf.append("</body></html>");
        System.out.println(exportWord(sbf.toString()));
    }

    /**
     * @param content rich-text content
     * @throws IOException
     */
    public String exportWord(String content) throws IOException {
        byte b[] = content.getBytes("GBK"); // The encoding must be set here; otherwise exported Chinese text is garbled.
        ByteArrayInputStream bais = new ByteArrayInputStream(b); // Wrap the byte array in a stream.
        POIFSFileSystem poifs = new POIFSFileSystem();
        DirectoryEntry directory = poifs.getRoot();
        DocumentEntry documentEntry = directory.createDocument("WordDocument", bais); // This step cannot be skipped; otherwise the output is garbled.
        // Write the output file.
        File file = File.createTempFile(UUIDTool.getUUID(), ".docx");
        FileOutputStream ostream = new FileOutputStream(file);
        poifs.writeFilesystem(ostream);
        FileInputStream fileInputStream = new FileInputStream(file);
        String upload = bosConfigService.upload(fileInputStream, "docx", "application/msword");
        bais.close();
        ostream.close();
        poifs.close();
        fileInputStream.close();
        return upload;
    }
}
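`exportWord` closes its four streams by hand at the end of the method, so an exception thrown by the write or the upload would leave them open and the temp file behind. The same write-then-read sequence with try-with-resources is sketched below using JDK streams only; the POI and `bosConfigService` calls are deliberately left out, and `TempFileRoundTrip` is an illustrative name.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class TempFileRoundTrip {

    /** Writes the payload to a temp file, reads it back, and always removes the file afterwards. */
    static byte[] writeAndReadBack(byte[] payload) {
        File file = null;
        try {
            file = File.createTempFile("report-", ".tmp");
            // The output stream is closed (and flushed) before the file is reopened for reading.
            try (OutputStream out = new FileOutputStream(file)) {
                out.write(payload);
            }
            try (InputStream in = new FileInputStream(file);
                 ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
                byte[] chunk = new byte[4096];
                int read;
                while ((read = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                return buffer.toByteArray();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                file.delete(); // best-effort cleanup of the temp file
            }
        }
    }
}
```

In the test above, the upload call would sit where the read loop is, inside the second try block, so the input stream is released whether or not the upload succeeds.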
src/test/java/cn/com/poc/thirdparty/resource/demand/ai/function/LongDocumentReaderFunctionTest.java
0 → 100644
package cn.com.poc.thirdparty.resource.demand.ai.function;

import cn.com.poc.thirdparty.resource.demand.ai.function.long_document_reader.LongDocumentReaderFunction;
import cn.com.yict.framemax.core.spring.SingleContextInitializer;
import org.junit.runner.RunWith;
import org.junit.Test;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.springframework.test.context.web.WebAppConfiguration;
import javax.annotation.Resource;
import java.util.UUID;

/**
 * @author alex.yao
 * @date 2025/5/12
 */
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(initializers = SingleContextInitializer.class)
@WebAppConfiguration
public class LongDocumentReaderFunctionTest {

    @Resource
    LongDocumentReaderFunction longDocumentReaderFunction;

    @Test
    public void test_function() {
        String content = "{\"file_url\": \"https://gsst-poe-sit.gz.bcebos.com/data/20250410/1744277235901.pdf\",\"question\":\"Can a registered Grade C electrical worker work on the electrical work of a Grade A electrical worker?\"}";
        String identifier = UUID.randomUUID().toString();
        AbstractFunctionResult<String> result = longDocumentReaderFunction.doFunction(content, identifier);
        System.out.println(result.getFunctionResult());
    }
}
src/test/java/cn/com/poc/thirdparty/resource/demand/ai/function/PdfToMdFunctionTest.java
 package cn.com.poc.thirdparty.resource.demand.ai.function;
-import cn.com.poc.thirdparty.resource.demand.ai.function.text_in_pdf2md.api.OCRClient;
+import cn.com.poc.thirdparty.resource.demand.ai.function.extraction.ContractExtractionFunction;
+import cn.com.poc.thirdparty.resource.textin.api.TextInClient;
 import cn.com.yict.framemax.core.spring.SingleContextInitializer;
 import com.fasterxml.jackson.databind.JsonNode;
 import com.fasterxml.jackson.databind.ObjectMapper;
...
@@ -10,6 +11,7 @@ import org.springframework.test.context.ContextConfiguration;
 import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
 import org.springframework.test.context.web.WebAppConfiguration;
+import javax.annotation.Resource;
 import java.nio.charset.StandardCharsets;
 import java.util.HashMap;
...
@@ -40,17 +42,28 @@ public class PdfToMdFunctionTest {
        options.put("paratext_mode", "annotation");
        options.put("parse_mode", "auto");
        options.put("table_flavor", "md");
        OCRClient client = new OCRClient();
        try {
            String response = client.recognize(fileContent, options);
            TextInClient textInClient = new TextInClient();
            String response = textInClient.OCR(fileContent, options);
            ObjectMapper mapper = new ObjectMapper();
            JsonNode jsonNode = mapper.readTree(response);
            if (jsonNode.has("result") && jsonNode.get("result").has("markdown")) {
                String markdown = jsonNode.get("result").get("markdown").asText();
                System.out.println(markdown);
            } else {
                System.out.println(response);
            }
        } catch (Exception e) {
            System.out.println("1111111");
            e.printStackTrace();
        }
    }

    @Resource
    private ContractExtractionFunction contractExtractionFunction;

    @Test
    public void test_cefunction() {
        System.out.println(contractExtractionFunction.getLLMConfig());
    }
}